1. Introduction

In many practical situations, e.g. functional Magnetic Resonance Imaging (fMRI)
or microarray data, the problem of simultaneously testing a large number $m$ of
null hypotheses arises. Since the Neyman-Pearson approach is the most commonly
used strategy for single testing, much research has focused on generalizing
this approach to the multiple testing case. First, one should choose a global
type I error to be controlled, such as the probability of making at least one
false discovery (family-wise error rate, FWER) or, more recently, the mean
proportion of false discoveries among all the discoveries (false discovery rate,
FDR, see Benjamini and Hochberg (1995)). Second, one should build a procedure
that controls the so-chosen global type I error rate. For instance, Benjamini and
Hochberg (1995) proved that the linear step-up procedure (LSU) controls the
FDR when the underlying tests are independent. Third, one should show that
the obtained procedure has good power performance, the power being generally
defined as the expected number of true discoveries.

To our knowledge, while the first two points above are widely studied (e.g.
building FDR controlling procedures, see e.g. Benjamini and Yekutieli (2001);
Storey (2002); Sarkar (2002)), the last point is most of the time evaluated
with simulations, without full theoretical support. Only a few works have
rigorously studied the optimality of certain classes of multiple testing
procedures (see Lehmann et al. (2005); Wasserman and Roeder (2006); Rubin et al.
(2006); Storey (2007); Finner et al. (2009)).

Maximizing the power while controlling the FDR remains a difficult task, 
because the FDR involves a random denominator (the number of discoveries). 
The present paper gives a contribution to the latter maximization problem, in 
the simple case where the null and alternative distributions are known. This 
framework is natural for the power maximization of tests, as it was also used in 
Neyman-Pearson's lemma for single testing. Although leading to oracle proce- 
dures, it can be used in practice as soon as the null and alternative distributions 
are estimated or guessed reasonably accurately from independent data. 

More formally, assume that each hypothesis is tested using a test statistic
that can then be transformed into a p-value $p_i$, and denote by $F_i$ the
alternative c.d.f. of $p_i$. In general, the $F_i$'s can be very different (e.g.
with heterogeneous underlying data) and the p-values cannot be considered
interchangeable. Therefore, a p-value weighting approach seems appropriate to
improve the performance of a multiple testing procedure. This technique, which
can be traced back to Holm (1979), consists in replacing each original p-value
$p_i$ by the weighted p-value $p'_i = p_i/w_i$ for some weight vector
$(w_1, \dots, w_m)$ summing to $m$. Here, we focus on the weighted version of the
LSU procedure that was proposed in Genovese et al. (2006) (see also Blanchard and
Roquain (2008)). In the latter paper, it was demonstrated that the weighted LSU
still controls the FDR for any weighting (under independence between the
p-values), and that some of these procedures could improve the power of the LSU
asymptotically. In the present paper, we aim to find the most powerful procedure
among all the weighted LSU procedures, or more precisely, to find a procedure
that mimics the best procedure among the weighted LSU procedures. Moreover, this
procedure should be computable from the p-value distributions, i.e. the $F_i$'s.

When using the weighted version of the FWER-controlling Bonferroni procedure,
Wasserman and Roeder (2006) (see also Rubin et al. (2006)) have found
the optimal weighting. In Storey (2007), an optimal procedure was also
proposed, maximizing the expected number of true discoveries while controlling
the expected number of false discoveries. All these procedures use deterministic
thresholds, which makes the power maximization feasible. However, in the case
of the FDR-controlling weighted LSU, the threshold depends on the final number
of discoveries and a power maximization seems very difficult to carry out, even
in the asymptotic framework where the number of p-values $m$ tends to infinity
(see Genovese et al. (2006)).

The main idea of this paper is to find the optimal weights simultaneously for
all the possible rejection proportions $u \in [0, 1]$. These multi-weights are
then collected in optimal weight functions $u \mapsto W_i^*(u)$, which in turn are
sequentially integrated in a step-up procedure. While the LSU procedure uses the
threshold function $u \mapsto \alpha u$, we find that the new procedure uses a
threshold function $u \mapsto \alpha u W_i^*(u)$ that is not necessarily linear
(and depends on the $F_i$'s).

The new procedure, called "optimal multi-weighted step-up procedure" , will 
be presented in detail in Section 3. In Section 4, we show that it enjoys the 
following properties: 

(i) FDR control for a finite number of hypotheses, up to slight modifications; 

(ii) power optimality for a finite number of hypotheses, up to error terms; 

(iii) power optimality without error term and FDR control without modifi- 
cation when the number of hypotheses m tends to infinity (in a specific 
asymptotic setting). 

These results are established in two different (classical) models of p-values,
both assuming independence between the p-values. The results (ii) and (iii)
additionally use that the $F_i$'s are strictly concave functions and that the
maximization of the power at any rejection proportion is feasible, which remain
quite mild assumptions.

In Section 5, we present a simulation study which exhibits the behavior of the
new procedure when the $F_i$'s are correctly specified or misspecified. Section 6
discusses some applications and our conclusions are given in Section 7. All our
results are proved in Section 8, while some technical parts are gathered in the
Appendix. Our proofs mainly use the "self-consistency condition" introduced in
Blanchard and Roquain (2008) (see also Finner et al. (2009)) and Hoeffding's
inequality (see Hoeffding (1963)).

2. Preliminaries 

2.1. Models for the p-values 

Let us first define the two different models for the p-values that will be used 
throughout the paper. 

We consider a finite set of $m$ null hypotheses on a probability space and we
let $H_i := 0$ (resp. 1) if the $i$-th null hypothesis is true (resp. false).
Letting $H := (H_i)_{1 \leq i \leq m} \in \{0, 1\}^m$, we denote by
$\mathcal{H}_0 := \{i \in \{1, \dots, m\} \mid H_i = 0\}$ the set corresponding to
the true null hypotheses and by $m_0 := |\mathcal{H}_0|$ its cardinality.
Analogously, we define $\mathcal{H}_1 := \{i \in \{1, \dots, m\} \mid H_i = 1\}$ and
$m_1 := |\mathcal{H}_1|$ for the alternative hypotheses. Since $\mathcal{H}_1$ is the
complement of $\mathcal{H}_0$ in $\{1, \dots, m\}$, we have $m_1 = m - m_0$. The
proportion of true nulls (resp. false nulls) is denoted by $\pi_0 := m_0/m$
(resp. $\pi_1 := m_1/m$) as usual. We suppose that for the $i$-th null hypothesis
we are given a p-value $p_i$, i.e. a measurable function from the observation
space into $[0, 1]$ such that the distribution of $p_i$ is uniform on $[0, 1]$
when the $i$-th null hypothesis is true:

\[ \forall i \in \mathcal{H}_0,\ \forall t \in [0, 1],\quad P(p_i \leq t) = t. \tag{1} \]

Under the alternative, we denote by $F_i$ the cumulative distribution function
of $p_i$: $\forall i \in \mathcal{H}_1$, $\forall t \in [0, 1]$, $F_i(t) := P(p_i \leq t)$.
In our setting, the $F_i$'s are allowed to be different and we denote by
$F := (F_i)_{i \in \mathcal{H}_1}$ the family of alternative c.d.f.'s. The p-values
are assumed mutually independent. The latter model has parameters $(H, F)$ and
will be referred to throughout the paper as the conditional model (because it
uses a fixed vector $H$).

Additionally, we will consider the so-called random effects model (see e.g.
Efron et al. (2001); Storey (2003); Genovese and Wasserman (2004)). In this
model, $H$ is generated independently from all other random variables, from $m$
i.i.d. Bernoulli priors. The probability for a null to be true (resp. false) is
denoted by $\pi_0 := P(H_i = 0) \in (0, 1)$ (resp. $\pi_1 := 1 - \pi_0$). Then, the
p-values are assumed to follow the conditional model conditionally on $H$: the
p-values are mutually independent conditionally on $H$ and each $p_i$ is uniform
conditionally on $H_i = 0$ (i.e. satisfies (1) conditionally on $H_i = 0$) and
has alternative c.d.f. $F_i$ conditionally on $H_i = 1$. As a consequence,
unconditionally, the p-values are independent and for $i = 1, \dots, m$, the
c.d.f. of each p-value is $t \mapsto \pi_0 t + \pi_1 F_i(t)$. This model has
parameters $(\pi_0, F)$ where $F = (F_i)_{1 \leq i \leq m}$ is the family of
alternative c.d.f.'s. The latter model will be referred to throughout the paper
as the unconditional model.
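As an illustration of the random effects model, the following minimal sketch draws $(H, p)$ for one-sided Gaussian tests. The Gaussian choice of alternative, the function name `simulate_unconditional` and its arguments are assumptions made for this example only; the model itself allows any family of alternative c.d.f.'s.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def simulate_unconditional(mu, pi0, rng):
    """Draw (H, p) from the random effects model (illustrative sketch).

    mu  : alternative means, one per hypothesis (assumed Gaussian setting)
    pi0 : prior probability that a null hypothesis is true
    rng : a random.Random instance, so that draws are reproducible
    """
    H, p = [], []
    for mu_i in mu:
        h = 0 if rng.random() < pi0 else 1       # H_i = 0 with probability pi0
        x = rng.gauss(mu_i if h else 0.0, 1.0)   # test statistic given H_i
        H.append(h)
        p.append(1.0 - _STD_NORMAL.cdf(x))       # one-sided p-value, uniform under the null
    return H, p
```

Conditionally on the returned `H`, the p-values follow the conditional model, which is how the two models are related in the text above.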

2.2. Assumptions and notation 

We introduce the following possible regularity assumptions on the parameter $F$
of each model, the derivative of $F_i$ being denoted by $f_i$.

the $F_i$'s are continuous, strictly concave functions on $[0, 1]$; (A1)

the $F_i$'s are twice differentiable on $(0, 1)$; (A2)

the functions $i \mapsto f_i(0^+)$ and $i \mapsto f_i(1^-)$ are constant; (A3)

for each $i, j$, $\lim_{y \to f_i(0^+)} f_i^{-1}(y)/f_j^{-1}(y)$ exists in $[0, +\infty]$. (A4)

As an illustration, the assumptions (A1)-(A4) are all satisfied in the one-sided
Gaussian testing case where we test for any $i$ the null "$\mu_i = 0$" against
"$\mu_i > 0$" from a Gaussian test statistic of mean $\mu_i$ and variance 1. In
that case, we have

\[ F_i(x) = \bar{\Phi}\big(\bar{\Phi}^{-1}(x) - \mu_i\big) \quad\text{and}\quad f_i(x) = \exp\Big\{\mu_i\Big(\bar{\Phi}^{-1}(x) - \frac{\mu_i}{2}\Big)\Big\}, \tag{2} \]

where we denoted $\bar{\Phi}(z) := P[Z \geq z]$ for $Z \sim \mathcal{N}(0, 1)$.
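The Gaussian alternative c.d.f. and density in (2) can be evaluated numerically as in the sketch below. The helper names (`upper_tail`, `upper_tail_inv`, `alt_cdf`, `alt_density`) are hypothetical and chosen for this example; the code relies only on `statistics.NormalDist` from the Python standard library and requires $x \in (0, 1)$.

```python
import math
from statistics import NormalDist

_N = NormalDist()


def upper_tail(z):
    """Phi-bar(z) = P[Z >= z] for Z ~ N(0, 1)."""
    return 1.0 - _N.cdf(z)


def upper_tail_inv(x):
    """Inverse of Phi-bar on (0, 1); by symmetry it equals -Phi^{-1}(x)."""
    return -_N.inv_cdf(x)


def alt_cdf(x, mu):
    """F_i(x) of equation (2) for alternative mean mu."""
    return upper_tail(upper_tail_inv(x) - mu)


def alt_density(x, mu):
    """f_i(x) of equation (2): the likelihood ratio evaluated at Phi-bar^{-1}(x)."""
    z = upper_tail_inv(x)
    return math.exp(mu * (z - mu / 2.0))
```

For $\mu > 0$ the density is decreasing in $x$, which is the strict concavity required in (A1).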

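Finally, for any non-decreasing function $F : [0, 1] \to [0, 1]$ we denote

\[ \mathcal{I}(F) := \sup\{u \in [0, 1] \mid F(u) \geq u\}, \]

\[ \mathcal{J}(F) := \sup\{u \in [0, 1] \mid \forall u' \leq u,\ F(u') \geq u'\}, \]

and for $\lambda > 0$,

\[ \mathcal{I}_\lambda^+(F) := (\mathcal{I}(F) + \lambda) - F(\mathcal{I}(F) + \lambda) \quad\text{when } \lambda < 1 - \mathcal{I}(F), \]

\[ \mathcal{I}_\lambda^-(F) := F(\mathcal{I}(F) - \lambda) - (\mathcal{I}(F) - \lambda) \quad\text{when } \lambda < \mathcal{I}(F). \]

We easily check that $\mathcal{I}(F)$ and $\mathcal{J}(F)$ are maxima and that
$F(\mathcal{I}(F)) = \mathcal{I}(F)$ and $F(\mathcal{J}(F)) = \mathcal{J}(F)$.

2.3. Multiple testing procedures, FDR and power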

A multiple testing procedure $R$ is defined as an algorithm which, from the data,
aims to reject part of the null hypotheses. Below, we will consider, as is usually
the case, multiple testing procedures which can be written as a function of the
family of p-values $p = (p_i, i \in \{1, \dots, m\})$. More formally, we define a
multiple testing procedure as a measurable function $R$, which takes as input a
realization of the p-value family $p \in [0, 1]^m$ and which returns a subset
$R(p)$ of $\{1, \dots, m\}$, corresponding to the rejected hypotheses (i.e.
$i \in R(p)$ means that the $i$-th hypothesis is rejected by $R$).

As introduced by Benjamini and Hochberg (1995), the false discovery rate
(FDR) of a multiple testing procedure is defined as the mean proportion of true
hypotheses in the set of the rejected hypotheses:

\[ \mathrm{FDR}(R) := E\left[\frac{|\mathcal{H}_0 \cap R(p)|}{|R(p)| \vee 1}\right], \tag{3} \]

where $|\cdot|$ denotes the cardinality function. Of course, the FDR in (3) depends
on the model chosen for the p-values. In particular, the FDR in the conditional
model involves an expectation taken conditionally on $H$, whereas the FDR in the
unconditional model additionally uses an averaging over $H$. It is worth noticing
that, if a procedure controls the FDR in the conditional model, that is,
conditionally on any value of $H \in \{0, 1\}^m$, it also controls the FDR
unconditionally.

Finally, we use the standard power criterion equal to the mean proportion
of correctly rejected hypotheses, that is,

\[ \mathrm{Pow}(R) := m^{-1} E\big[|\mathcal{H}_1 \cap R(p)|\big]. \tag{4} \]

In the notation below, we will sometimes drop the explicit dependency on $p$
for short, writing e.g. $R$ instead of $R(p)$.
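The quantities inside the expectations in (3) and (4) can be computed for a single realization as sketched below; averaging them over independent replications gives Monte Carlo estimates of the FDR and of the power. The function names are hypothetical, and hypotheses are indexed from 0 rather than from 1.

```python
def false_discovery_proportion(H, rejected):
    """|H_0 ∩ R| / (|R| ∨ 1) for one realization; H[i] == 0 marks a true null."""
    false_rejections = sum(1 for i in rejected if H[i] == 0)
    return false_rejections / max(len(rejected), 1)


def true_discovery_proportion(H, rejected):
    """|H_1 ∩ R| / m for one realization; H[i] == 1 marks a false null."""
    return sum(1 for i in rejected if H[i] == 1) / len(H)
```

The maximum with 1 in the denominator makes the false discovery proportion equal to zero when nothing is rejected, as in (3).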



2.4. Weighted linear step-up procedures 

Let us consider $w = (w_i)_i$ a vector of non-negative real numbers such that
$\sum_{i=1}^m w_i = m$, called here a weight vector, and consider the weighted
p-values $p'_i = p_i/w_i$, ordered as $p'_{(1)} \leq \dots \leq p'_{(m)}$, with the
convention $p'_{(0)} = 0$.

As introduced by Genovese et al. (2006), the weighted linear step-up procedure
associated to $w$, denoted here by LSU($w$), rejects the $i$-th hypothesis if
$p'_i \leq \alpha \hat{u}$, with

\[ \hat{u} = m^{-1} \max\{r \in \{0, 1, \dots, m\} \mid p'_{(r)} \leq \alpha r/m\}. \tag{5} \]

In particular, the procedure LSU($w$) using $w_i = 1$ for all $i$ corresponds to
the standard linear step-up procedure of Benjamini and Hochberg (1995), denoted
here by LSU. Letting $\hat{G}_w(u) = m^{-1} \sum_{i=1}^m \mathbf{1}\{p_i \leq \alpha w_i u\}$,
the rejection proportion $\hat{u}$ can equivalently be defined as

\[ \hat{u} = \mathcal{I}(\hat{G}_w), \tag{6} \]

using the notation of Section 2.2. Contrary to (5), expression (6) does not make
any specific use of the reordered p-values $p'_{(1)}, \dots, p'_{(m)}$, so that it
is generally more convenient from a mathematical point of view.

For any choice of weight vector $w$, Genovese et al. (2006) proved that the
weighted linear step-up procedure controls the FDR at level
$\alpha m^{-1} \sum_{i=1}^m (1 - H_i) w_i \leq \alpha$ in the conditional model and
(thus) at level $\pi_0 \alpha \leq \alpha$ in the unconditional model.
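The weighted linear step-up procedure can be sketched in a few lines using the counting form of the rejection proportion: the condition $p'_{(r)} \leq \alpha r/m$ holds exactly when at least $r$ p-values satisfy $p_i \leq \alpha w_i r/m$. The function name `weighted_lsu` and the 0-based indexing are choices made for this example, and the quadratic-time scan is kept for readability rather than efficiency.

```python
def weighted_lsu(p, w, alpha):
    """Return the set of (0-based) indices rejected by LSU(w) at level alpha."""
    m = len(p)

    def threshold(i, r):
        # individual threshold alpha * w_i * r / m at rejection proportion r / m
        return alpha * w[i] * r / m

    r_hat = 0
    for r in range(m, 0, -1):  # step-up: scan from the largest candidate r
        if sum(1 for i in range(m) if p[i] <= threshold(i, r)) >= r:
            r_hat = r
            break
    return {i for i in range(m) if p[i] <= threshold(i, r_hat)}
```

With `w = [1.0] * m` this reduces to the standard linear step-up procedure. On the small example `p = [0.04, 0.9]` at level 0.05, uniform weights reject nothing whereas the weight vector `[2.0, 0.0]` rejects the first hypothesis, which illustrates how weighting can change the outcome.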

3. New approach 

We present in this section a new family of multiple testing procedures, called 
multi-weighted procedures. We start by motivating their introduction from the 
power optimization problem among the family of weighted linear step-up pro- 
cedures. 

3.1. Weight functions 

Following Genovese et al. (2006), the explicit computation of the power of
LSU($w$) is a difficult task (even asymptotically): it depends on the final
proportion of rejections of the procedure $\hat{u} = \mathcal{I}(\hat{G}_w)$, which
is a random variable itself depending on $w$. Therefore, we propose here to
perform the optimization for each fixed rejection proportion $u$, which in turn
leads to a family of optimal weight vectors depending on $u$, $0 < u \leq 1$.

First, define the power of the procedure that thresholds each p-value $p_i$ at
level $\alpha w_i u$:

\[ \mathrm{Pow}_u(w) := \mathrm{Pow}(\{i \mid p_i \leq \alpha w_i u\}), \tag{7} \]

corresponding intuitively to the "power of LSU($w$) at rejection proportion $u$".

Second, define a weight (vector) function as a function
$W : u \in (0, 1] \mapsto W(u) = (W_i(u))_i \in (\mathbb{R}^+)^m$ such that each
$W(u)$ is a weight vector, that is, $\forall u \in (0, 1]$,
$\sum_{i=1}^m W_i(u) = m$, and such that the following property holds:

\[ \forall i \in \{1, \dots, m\},\quad u \mapsto W_i(u)\, u \ \text{ is nondecreasing on } (0, 1]. \tag{8} \]

Additionally, a weight function is said to be continuous if for all $i$, the
functions $u \in (0, 1] \mapsto W_i(u)$ are continuous.

Definition 3.1. Any weight function $W^*$ solving simultaneously the maximization
problems

\[ \forall u \in (0, 1],\quad \mathrm{Pow}_u(W^*(u)) = \max\{\mathrm{Pow}_u(w),\ w \text{ weight vector}\}, \tag{9} \]

is called the optimal weight function.
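To make the criterion in (7) concrete, the sketch below evaluates $\mathrm{Pow}_u(w)$ in the unconditional model with one-sided Gaussian alternatives, where it equals $m^{-1} \sum_i \pi_1 F_i(\alpha w_i u)$ with $F_i$ as in (2). The Gaussian setting and the names `alt_cdf` and `power_at_u` are assumptions for this illustration.

```python
from statistics import NormalDist

_N = NormalDist()


def alt_cdf(t, mu):
    """Gaussian alternative c.d.f. of (2), clipped to [0, 1] for out-of-range thresholds."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    # Phi-bar(Phi-bar^{-1}(t) - mu), written with the standard normal c.d.f.
    return 1.0 - _N.cdf(-_N.inv_cdf(t) - mu)


def power_at_u(w, mu, alpha, u, pi1):
    """Pow_u(w) in the unconditional Gaussian model: mean of pi1 * F_i(alpha * w_i * u)."""
    m = len(w)
    return pi1 * sum(alt_cdf(alpha * wi * u, mi) for wi, mi in zip(w, mu)) / m
```

Comparing `power_at_u` across candidate weight vectors at a fixed `u` is the finite-dimensional problem that the optimal weight function solves for every `u` at once.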

Note that $W^*$ is called here, by abuse of language, "the" optimal weight
function even if it is not proved to be unique. Of course the optimal weight
function depends on the model chosen for the p-values. The following proposition
gives (strong) sufficient conditions for existence and uniqueness of the optimal
weight function in the different models described in Section 2.1.

Proposition 3.2. Assume (A1)-(A2)-(A3) and denote the derivative of $F_i$ by
$f_i$. Then the weight function $W^*$ satisfying (9) exists and is unique in
either of the following cases:

• In the conditional model, if $\alpha < \pi_1$, with for all $u \in (0, 1]$,
  $W_i^*(u) = \mathbf{1}\{H_i = 1\}\, f_i^{-1}(y^*(u))/(\alpha u)$;

• In the unconditional model, with for all $u \in (0, 1]$,
  $W_i^*(u) = f_i^{-1}(y^*(u))/(\alpha u)$. (10)

In each case, $y^*(u)$ is defined as the unique element providing
$\sum_{i=1}^m W_i^*(u) = m$. Moreover, in both models, the weight function $W^*$
is continuous, and assuming in addition (A4), the limits $W_i^*(0^+)$ exist for
all $i$.

The proof, which is based on arguments similar to those proposed in
Rubin et al. (2006) and Wasserman and Roeder (2006), is given in Section 8.6.
Of course, the optimal weight function depends on the parameters of the model:
on $(H, F)$ in the conditional model and on $F$ (only) in the unconditional model.

For instance, when the p-values are generated from the Gaussian model (2),
the optimal weight function in the unconditional model is given by

\[ W_i^*(u) = (\alpha u)^{-1}\, \bar{\Phi}\Big(\frac{\mu_i}{2} + \frac{c(u)}{\mu_i}\Big), \tag{11} \]

where $c(u)$ is the unique element of $\mathbb{R}$ such that
$\sum_{i=1}^m W_i^*(u) = m$. It therefore only depends on the vector of
alternative means $\mu = (\mu_i)_{1 \leq i \leq m}$. Figure 1 displays the optimal
weight vectors $W^*(u)$ for a particular choice of means and different values of
$u$. We observe that $W^*(u)$ strongly depends on $u$: for $u = 1$, the weight
vector is larger for small means, whereas as $u$ decreases, the weight vector is
maximal on larger means. In particular, for small $u$, the weighting is close to
zero for the smallest means, because they produce p-values much larger than
$\alpha u$ (with high probability).

The Gaussian formula (11) can also be suitable for test statistics that are
"close to Gaussian", namely for locally uniform asymptotically normal test
statistics (see Chapter 14 of van der Vaart (1998) and Section 4.3 and Section 7
of Roquain and van de Wiel (2008)). This is the case for instance for the
Mann-Whitney test statistic.

3.2. Multi-weighted procedures 

From the previous section, we now have to integrate several weight vectors in
a single multiple testing procedure, or more precisely to use a weight vector $w$
which may depend on $u$. For this, we extend the definition of weighted linear
procedures to the case of multi-weighted procedures.

Fig 1. Plot of the optimal weights $(W_i^*(u))_i$ as a function of the alternative
means for $u = 1/m$ (solid), $u = 10/m$ (dashed-dotted), $u = 100/m$ (dotted),
$u = 1$ (dashed). Unconditional model and Gaussian one-sided case with $m = 1000$,
$\alpha = 0.05$, $\mu_i = 5i/m$ for $i = 1, \dots, m$. Each curve is normalized to
have a maximum equal to 1.



First, we define the threshold collection $\Delta = (\Delta_i(u))_{i,u}$ associated
to a given weight function $W(\cdot) = (W_i(\cdot))_i$ by: $\forall i \in \{1, \dots, m\}$,
$\forall u \in [0, 1]$,

\[ \Delta_i(u) := \alpha W_i(u)\, u \ \text{ if } u > 0 \quad\text{and}\quad \Delta_i(0) := 0. \]

Conversely, given any threshold collection $\Delta = (\Delta_i(u))_{i,u}$ such that
each $\Delta_i$ is nonnegative, nondecreasing on $[0, 1]$ and such that
$\forall u \in (0, 1]$, $m^{-1} \sum_{i=1}^m \Delta_i(u) = \alpha u$, we define the
weight function $W = (W_i(u))_{i,u}$ associated to $\Delta$ by
$\forall i \in \{1, \dots, m\}$, $\forall u \in (0, 1]$,
$W_i(u) := \Delta_i(u)/(\alpha u)$. As a consequence, the threshold collection
$\Delta$ and the weight function $W$ are in one-to-one correspondence.

Definition 3.3. Consider a weight function $W(\cdot) = (W_i(\cdot))_i$ and its
associated threshold collection $\Delta$. The multi-weighted step-up procedure
with weight function $W$, denoted by SU($W$), rejects the $i$-th null hypothesis
if $p_i \leq \Delta_i(\hat{u})$, where

\[ \hat{u} = \mathcal{I}(\hat{G}_W), \tag{12} \]

and where we denoted $\hat{G}_W(u) := m^{-1} \sum_{i=1}^m \mathbf{1}\{p_i \leq \Delta_i(u)\}$
for all $u \in [0, 1]$.

In particular, in the case where for all $u$, $W_i(u) = w_i$ is independent of $u$,
the procedure SU($W$) reduces to LSU($w$). More generally, the above definition of
SU($W$) allows choosing thresholds $\Delta_i(u)$ that are not linear in $u$.

As for LSU($w$), the multi-weighted procedure SU($W$) can also be derived
from a re-ordering based algorithm. The main difference is that the original
p-values are ordered in several ways, because several weightings are used. Namely,
if for $r \geq 1$, $q_r$ denotes the $r$-th smallest $W(r/m)$-weighted p-value, i.e.
is equal to $p'_{(r)}$ where $\forall i$, $p'_i = p_i/W_i(r/m)$, and putting
$q_0 = 0$, we have

\[ \hat{u} = m^{-1} \max\{r \in \{0, 1, \dots, m\} \mid q_r \leq \alpha r/m\}. \]

Similarly to the step-up case, we can define the multi-weighted step-down
procedure with weight function $W$ (and associated threshold collection $\Delta$),
denoted by SD($W$), as rejecting the $i$-th null hypothesis if
$p_i \leq \Delta_i(\hat{u})$, where

\[ \hat{u} = \mathcal{J}(\hat{G}_W), \tag{13} \]

or equivalently,
$\hat{u} = m^{-1} \max\{r \in \{0, 1, \dots, m\} \mid \forall r' \leq r,\ q_{r'} \leq \alpha r'/m\}$.

Note that the procedures SU($W$) and SD($W$) only use the values of $W(u)$
for $u \in \{1/m, 2/m, \dots, 1\}$, which makes them easily computable. We refer the
reader to Appendix A for explicit algorithmic versions of the procedures SU($W$)
and SD($W$).
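The re-ordering descriptions above translate into a short counting algorithm, because $q_r \leq \alpha r/m$ holds exactly when at least $r$ p-values fall below their own threshold $\alpha W_i(r/m)\, r/m$. The sketch below is not the Appendix A algorithm; the function name `multi_weighted`, the `step` argument and the callable `weight_fn` (returning the weight vector $W(u)$) are hypothetical names introduced for illustration.

```python
def multi_weighted(p, weight_fn, alpha, step="up"):
    """Rejected (0-based) indices of SU(W) when step == "up", of SD(W) otherwise.

    weight_fn(u) must return the weight vector W(u) for u in {1/m, ..., 1}.
    """
    m = len(p)

    def thresholds(r):
        if r == 0:
            return [0.0] * m
        w = weight_fn(r / m)
        return [alpha * wi * r / m for wi in w]

    def passes(r):
        # q_r <= alpha * r / m  <=>  at least r p-values are below their threshold
        return sum(1 for pi, ti in zip(p, thresholds(r)) if pi <= ti) >= r

    if step == "up":
        # largest r that passes, regardless of smaller values of r
        r_hat = max((r for r in range(1, m + 1) if passes(r)), default=0)
    else:
        # largest r such that every r' <= r passes
        r_hat = 0
        while r_hat < m and passes(r_hat + 1):
            r_hat += 1
    t = thresholds(r_hat)
    return {i for i in range(m) if p[i] <= t[i]}
```

With a constant `weight_fn` the step-up branch coincides with LSU($w$). Since the step-down rejection proportion never exceeds the step-up one and the thresholds are nondecreasing in $u$, the set returned for `step="down"` is always contained in the one returned for `step="up"`.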

The particular multi-weighted step-up procedure SU($W^*$) using the optimal
weight function $W^*$ is called the optimal multi-weighted procedure. From
an intuitive point of view, since this weighting maximizes the power at any
rejection proportion, the latter procedure should be more powerful than any
standard weighted procedure LSU($w$). One of the goals of Section 4 is to state
this optimality result formally.

Finally, let us remark that in the unconditional model and under the
assumptions and notation of Proposition 3.2, the optimal multi-weighted step-up
procedure may be written in the following form: reject the $i$-th hypothesis if
$f_i(p_i) \geq y^*(\hat{u})$, where $f_i$ is the (decreasing) alternative density of
$p_i$ and $y^*(\hat{u})$ is adjusted from all the p-values, the $f_i$'s and the
pre-specified level $\alpha$. As a consequence, this procedure is based on
individual tests of Neyman-Pearson type (the observed variables being restricted
to the p-values).

4. Main results 

We present in this section the main properties of the multi-weighted procedures. 
First, the finite-sample FDR control of SU(W) for any weight function W, 
up to slight modifications. Second, the finite-sample power optimality of the 
procedure SU(W*) (using the optimal weight function), up to some small error 
terms. Third, a consistency result, proving that the latter slight modifications 
are unnecessary and that error terms vanish, in a particular asymptotic setting 
where m tends to infinity. 

4.1. Finite-sample FDR control

First, let us recall that for any choice of weight vector $w = (w_1, \dots, w_m)$,
the weighted linear step-up procedure LSU($w$) controls the FDR at level
$\alpha m^{-1} \sum_{i=1}^m (1 - H_i) w_i \leq \alpha$ in the conditional model and
at level $\pi_0 \alpha \leq \alpha$ in the unconditional model (see Genovese et al.
(2006); Blanchard and Roquain (2008)). These controls are non-asymptotic, in the
sense that they are valid for any finite $m \geq 2$.

Unfortunately, the procedure SU($W$) cannot be proved to control the FDR at
level $\alpha$ for any choice of weight function $W$ and for any $m \geq 2$. In
Appendix C, a (least favorable) choice of weight function is given when $m = 2$,
for which FDR(SU($W$)) slightly exceeds $\alpha$. Therefore, in order to obtain
rigorous FDR control for each $m$ and any weight function, we need to slightly
correct SU($W$).

E. Roquain and M. van de Wiel/Optimal weighting for FDR control

Theorem 4.1. Consider $W(\cdot) = (W_i(\cdot))_i$ any weight function. Then for any
finite $m \geq 2$, the two following procedures

• SU($\widetilde{W}$) with $\widetilde{W}_i(u) = W_i(u)/(1 + \alpha W_i(1))$,

• SD($\widetilde{W}$) with $\widetilde{W}_i(u) = W_i(u)/(1 + \alpha u W_i(u))$,

have their FDR less than or equal to

\[ \alpha \max_{1 \leq k \leq m} \Big\{ m^{-1} \sum_{i=1}^m (1 - H_i) W_i(k/m) \Big\} \leq \alpha \]

in the conditional model. As a consequence, their FDR are less than or equal to
$\alpha\, E\big(\max_{1 \leq k \leq m} \{ m^{-1} \sum_{i=1}^m (1 - H_i) W_i(k/m) \}\big) \leq \alpha$
in the unconditional model.

The proof of Theorem 4.1 is given in Section 8.3. Note that this result covers
the earlier result of Genovese et al. (2006); Blanchard and Roquain (2008), by
taking $W_i(u) = w_i$ constant in $u$.
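The two corrections of Theorem 4.1 are simple transformations of a weight function, as the following sketch shows. The function names are hypothetical, and `weight_fn` is assumed to be a callable returning the weight vector $W(u)$ as a list.

```python
def corrected_su_weights(weight_fn, alpha):
    """u -> W_i(u) / (1 + alpha * W_i(1)): the step-up correction of Theorem 4.1."""
    w_one = weight_fn(1.0)
    return lambda u: [wi / (1.0 + alpha * w1) for wi, w1 in zip(weight_fn(u), w_one)]


def corrected_sd_weights(weight_fn, alpha):
    """u -> W_i(u) / (1 + alpha * u * W_i(u)): the step-down correction of Theorem 4.1."""
    return lambda u: [wi / (1.0 + alpha * u * wi) for wi in weight_fn(u)]
```

Because property (8) gives $u W_i(u) \leq W_i(1)$, the step-up correction shrinks each weight at least as much as the step-down correction does, and both shrink by a factor close to 1 when $\alpha W_i(1)$ is small. The corrected functions no longer sum to $m$, so they are weight functions only in a loose sense.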

Since from (8), we have $\alpha u W_i(u) \leq \alpha W_i(1)$, both modifications of
SU($W$) proposed above should not be too large when $\alpha W_i(1)$ is close to 0
(e.g. when $\alpha$ is small). Furthermore, while the correction proposed in the
weighting of the step-up procedure is more conservative than that of the
step-down, a step-up procedure is always more powerful than a step-down procedure
(for the same threshold collection). Therefore, in general, neither modified
procedure dominates the other. Nevertheless, in the particular simulation setting
of Section 5, we will see that the step-down modification appears to be better.

When using the optimal weight function $W^*$, Theorem 4.1 provides two
modifications of the optimal multi-weighted procedure SU($W^*$) that control the
FDR. More importantly, it shows that any misspecification in $W^*$ (e.g. in the
model parameters) still leads to the correct FDR control. This is a crucial point
in practice.

Explicit finite-sample bounds for the FDR of SU($W$), the step-up procedure
without modification, are given in Proposition 8.4 in the unconditional model
(see Section 8.5). They show that FDR(SU($W$)) should be close to $\pi_0 \alpha$
when $m$ is large, so that the modifications of Theorem 4.1 are no longer needed
in that case. We will develop the resulting FDR consistency result more formally
in Section 4.3 under some asymptotic conditions and for the optimal weighting.

4.2. Finite-sample power optimality

For a given weight function $W$ with associated threshold collection $\Delta$, let
us denote

\[ G_W(u) := E\big[\hat{G}_W(u)\big] = m^{-1} \sum_{i=1}^m P[p_i \leq \Delta_i(u)], \tag{14} \]

the mean proportion of rejections at levels $(\Delta_i(u))_i$, and define similarly
$G_w(u)$ for a weight vector $w$. Then the following theorem holds:

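Theorem 4.2. In the unconditional model, assume that $F$ satisfies (A1) and
consider a weight function $W^*$ which maximizes the power at every rejection
proportion, i.e. satisfying (9). Consider $\lambda > 0$ with
$\lambda < \pi_0(1 - \alpha)$. Then we have for any finite $m$,

\[ \mathrm{Pow}(\mathrm{SU}(W^*)) \geq \max_{w} \big\{ \mathrm{Pow}(\mathrm{LSU}(w)) - \varepsilon(m, \mathcal{I}_\lambda^+(G_w)) \big\} - \varepsilon(m, \mathcal{I}_\lambda^-(G_{W^*}))\, \mathbf{1}\{\lambda < \mathcal{I}(G_{W^*})\} - 2\lambda(1 - \alpha\pi_0), \tag{15} \]

where $\forall x \in \mathbb{R}$,
$\varepsilon(m, x) := \pi_1 m \exp\{-2m (x - m^{-1})_+^2\}$ and where the maximum
is taken over all the weight vectors $w$. Moreover, we have
$\mathcal{I}_\lambda^-(G_{W^*}) > 0$ when $\lambda < \mathcal{I}(G_{W^*})$ and
$\mathcal{I}_\lambda^+(G_w) > 0$.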

The proof is made in Section 8.2. Expression (15) can be seen as a non- 
asymptotic "oracle inequality" , stating that the power of the optimal multi- 
weighted procedure is close to the power of the best weighted linear step-up 
procedure. This finite-sample optimality result makes sense because SU(W*), as 
all the weighted linear step-up procedures, controls the FDR non-asymptotically 
at level a (up to the slight modifications presented in Section 4.1). 

In Theorem 4.2, the condition $\lambda < \pi_0(1 - \alpha)$ (resp.
$\lambda < \mathcal{I}(G_{W^*})$) ensures that $\mathcal{I}_\lambda^+(G_w)$ (resp.
$\mathcal{I}_\lambda^-(G_{W^*})$) is well defined. Moreover, in (15), $\lambda$
should be chosen such that the error terms $\varepsilon(m, \mathcal{I}_\lambda^+(G_w))$,
$\varepsilon(m, \mathcal{I}_\lambda^-(G_{W^*}))$ and $2\lambda(1 - \alpha\pi_0)$ are
as small as possible. From an asymptotic point of view, assuming that the
quantities $\mathcal{I}_\lambda^+(G_w)$ and $\mathcal{I}_\lambda^-(G_{W^*})$ are
bounded away from 0 when $m$ tends to infinity (for any fixed $\lambda$), the
error terms tend to zero by taking successively $m$ tending to infinity and
$\lambda$ tending to zero. However, the best choice $\lambda = \lambda_m$ depends
on the parameter $F$ and seems quite difficult to derive in an explicit form (and
so are the corresponding convergence rates in (15)).

The next section presents sufficient asymptotic conditions making
$\mathcal{I}_\lambda^-(G_{W^*})$ and $\mathcal{I}_\lambda^+(G_w)$ bounded away from 0
when $m$ tends to infinity, so that the error terms asymptotically vanish in the
oracle inequality (15).

4.3. Consistency

We propose in this section an asymptotic framework in which the optimality of
SU($W^*$) and its FDR control hold when $m$ tends to infinity, without
modification or error term.

First, we define the asymptotic setting. For all $m \geq 2$, we consider the
$m$-unconditional model, where the $m$ p-values are chosen as the first $m$
p-values of an infinite sequence of independent p-values $(p_i)_{i \geq 1}$, each
p-value $p_i$ having the c.d.f. $t \mapsto \pi_0 t + \pi_1 F_i(t)$, for a given
infinite sequence of c.d.f.'s $F = (F_i)_{i \geq 1}$. In this context, the weight
functions depend on $m$, and we underline this dependence in the notation by
writing $W^{(m)}$ instead of $W$ (and $w^{(m)}$ instead of $w$).

Second, we define a converging weight function sequence $(W^{(m)})_m$ as a
sequence of weight functions such that the associated function sequence
$(G_{W^{(m)}})_m$ (defined in (14)) converges pointwise (on $[0, 1]$). For short,
we will often use the notation $G^\infty$ for the limit function of
$(G_{W^{(m)}})_m$.

Theorem 4.3. Consider the above asymptotic framework in which $F$ is assumed
to satisfy (A1) and consider a class $\mathcal{W}$ of converging weight function
sequences. Let $(W^{*,(m)})_m$ be a sequence of weight functions such that for
all $m$, $W^{*,(m)}$ maximizes the power at every rejection proportion in the
$m$-unconditional model (i.e. satisfies (9)). For the sequence $(W^{*,(m)})_m$,
assumed to lie in $\mathcal{W}$, and for any weight vector sequence $(w^{(m)})_m$
belonging to $\mathcal{W}$, we additionally assume that the associated limit
function $G^\infty$ is continuous and satisfies
$\mathcal{I}_\lambda^-(G^\infty) > 0$ for $\lambda < \mathcal{I}(G^\infty)$. Then the
multi-weighted procedure SU($W^{*,(m)}$) satisfies

\[ \lim_m \mathrm{Pow}(\mathrm{SU}(W^{*,(m)})) \geq \max_{(w^{(m)})_m} \Big\{ \lim_m \mathrm{Pow}(\mathrm{LSU}(w^{(m)})) \Big\}, \tag{16} \]

the maximum above being taken over any sequence of weight vectors $(w^{(m)})_m$
belonging to $\mathcal{W}$. Moreover, we have

\[ \lim_m \mathrm{FDR}(\mathrm{SU}(W^{*,(m)})) \leq \pi_0 \alpha, \tag{17} \]

assuming either that $\mathcal{I}(G^\infty) > 0$ (with
$G^\infty = \lim_m G_{W^{*,(m)}}$) or that we have
$\lim_n \lim_m m^{-1} \sum_{i=1}^m \sup_{0 < u \leq n^{-1}} \{W_i^{*,(m)}(u)\} = 1$.

Theorem 4.3 is proved in Section 8.5.

Under the conditions of Theorem 4.3, inequalities (16) and (17) imply that
SU($W^*$) is asymptotically more powerful than any weighted linear step-up
procedure (in a certain class of converging weight vector sequences) while having
the same asymptotic FDR control. Since the uniform weighting sequence
$w_i^{(m)} = 1$ is always converging (with a continuous strictly concave limit
function, from (A1)), it can always be added to the class $\mathcal{W}$. As a
consequence, the procedure SU($W^*$) always improves the original LSU
asymptotically. However, this should be balanced with the fact that SU($W^*$)
uses the true parameters of the model, whereas LSU does not.

To satisfy the assumptions of Theorem 4.3, we have to choose a convenient 
class of converging weighting sequences W, containing the optimal weighting 
sequence. We give below two examples of such choice when F is assumed to 
have a particular structure. 

A first example is the case of clustered p-values: consider a parameter $F$
satisfying (A1) and such that $F_i$ is equal to $F_A$ (resp. $F_B$) for
$i \in S_A^{(m)}$ (resp. $i \in S_B^{(m)}$), where $\{S_A^{(m)}, S_B^{(m)}\}$ forms a
(deterministic) partition of $\{1, \dots, m\}$ (this model may of course be
generalized to the case of $K \geq 2$ clusters). For simplicity, we assume that the
proportion of p-values $\pi_A = |S_A^{(m)}|/m$ in cluster $S_A^{(m)}$ (resp.
$\pi_B = |S_B^{(m)}|/m$ in cluster $S_B^{(m)}$) does not depend on $m$ (this holds
up to taking a subsequence of $m$). In this context, we easily check that a weight
vector maximizing the power (at a given rejection proportion) has the same
weight within a cluster. It is therefore natural to consider the following class
of weightings:

\[ \mathcal{W} = \Big\{ (W^{(m)})_m \ \Big|\ \forall m,\ W_i^{(m)} = W_A \text{ (resp. } W_B\text{) for } i \in S_A^{(m)} \text{ (resp. } i \in S_B^{(m)}\text{)}, \text{ for } W_A(\cdot), W_B(\cdot) \geq 0 \text{ satisfying } \forall u \in (0, 1],\ \pi_A W_A(u) + \pi_B W_B(u) = 1, \text{ and } u \mapsto W_A(u)u,\ u \mapsto W_B(u)u \text{ continuous nondecreasing on } [0, 1] \Big\}. \]

Since for any weight function sequence $(W^{(m)})_m$ of $\mathcal{W}$ the function
$G_{W^{(m)}}(u) = \pi_0 \alpha u + \pi_1 \pi_A F_A(\alpha u W_A(u)) + \pi_1 \pi_B F_B(\alpha u W_B(u))$
does not depend on $m$, $\mathcal{W}$ is a class of converging weight function
sequences. Moreover, $G^\infty(u) = G_{W^{(m)}}(u)$ is continuous and satisfies
$\mathcal{I}_\lambda^-(G^\infty) > 0$ for $\lambda < \mathcal{I}(G^\infty)$, either for
$W^{(m)}(u) = w^{(m)}$ a weight vector, or for $W^{(m)}(u) = W^{*,(m)}(u)$ the
optimal weight function (from one of the last statements of Theorem 4.2).
Finally, we can apply Theorem 4.3 to obtain the oracle inequality (16). Moreover,
the last assumption required for the FDR control (17) holds assuming that the
limits $W_A^*(0^+)$ and $W_B^*(0^+)$ exist (as is the case under assumptions
(A1)-(A4), see Proposition 3.2).
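In the two-cluster setting, the limit function and its largest crossing point $\mathcal{I}(G^\infty)$ with the identity can be evaluated numerically. The sketch below assumes one-sided Gaussian alternatives as in (2) and constant weights $W_A, W_B$ (the LSU($w$) case), for which $G^\infty$ is concave so that bisection on $G^\infty(u) - u$ is legitimate; all function and parameter names are hypothetical.

```python
from statistics import NormalDist

_N = NormalDist()


def alt_cdf(t, mu):
    """Gaussian alternative c.d.f. of (2), clipped to [0, 1]."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    return 1.0 - _N.cdf(-_N.inv_cdf(t) - mu)


def limit_g(u, alpha, pi0, pi_a, mu_a, mu_b, w_a=1.0, w_b=1.0):
    """G(u) = pi0*alpha*u + pi1*pi_A*F_A(alpha*u*W_A) + pi1*pi_B*F_B(alpha*u*W_B).

    w_a and w_b are constant weights and should satisfy pi_a*w_a + pi_b*w_b = 1.
    """
    pi1, pi_b = 1.0 - pi0, 1.0 - pi_a
    return (pi0 * alpha * u
            + pi1 * pi_a * alt_cdf(alpha * u * w_a, mu_a)
            + pi1 * pi_b * alt_cdf(alpha * u * w_b, mu_b))


def largest_crossing(g, iters=200):
    """Largest u in [0, 1] with g(u) >= u, assuming g is concave with g(0) = 0 and g(1) < 1."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) >= mid:
            lo = mid
        else:
            hi = mid
    return lo
```

The value returned by `largest_crossing` can be read as the asymptotic rejection proportion of the corresponding weighted linear step-up procedure under these assumptions.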

A second example is the continuous one-sided Gaussian setting, where for
all $m$ and $1 \leq i \leq m$, $F_i(x) = \bar\Phi(\bar\Phi^{-1}(x) - \mu(i/m))$, for a mean function
$\mu : [0,1] \to \mathbb{R}^+$ assumed continuous with $\mu(t) > 0$ for $t > 0$. In this context, we
denote $F_{i/m}$ and $W^{(m)}_{i/m}$ instead of $F_i$ and $W^{(m)}_i$ for convenience. Also note
that the function $t \mapsto F_t$ can be extended to all $t$ in $[0,1]$. In that setting, it is
relevant to consider the following class of weighting:

\[
\begin{aligned}
\mathcal{W} = \Big\{ (W^{(m)})_m \text{ weight function sequence such that }\
& \forall u \in [0,1],\ \frac{1}{m}\sum_{i=1}^m F_{i/m}\big(\alpha u W^{(m)}_{i/m}(u)\big) \xrightarrow[m\to\infty]{} \int_0^1 F_t(\alpha u W_t(u))\,dt, \\
& \text{for } (W_t(\cdot))_{t\in[0,1]} \geq 0 \text{ satisfying } \forall u,\ t \in [0,1] \mapsto W_t(u) \text{ continuous} \Big\}.
\end{aligned}
\]

Any weight function sequence $(W^{(m)})_m$ of $\mathcal{W}$ is converging, because
$G_{W^{(m)}}(u) = \pi_0 \alpha u + \pi_1 m^{-1} \sum_{i=1}^m F_{i/m}(\alpha u W^{(m)}_{i/m}(u))$ converges to
$G^\infty(u) = \pi_0 \alpha u + \pi_1 \int_0^1 F_t(\alpha u W_t(u))\,dt$.
Moreover, expression (11) provides the form of the optimal weight function:
$\alpha u W^{*,(m)}_{i/m}(u) = \bar\Phi\big(\mu(i/m)/2 + c^{(m)}(u)/\mu(i/m)\big)$, where $c^{(m)}(u)$ is taken such
that $\sum_{i=1}^m W^{*,(m)}_{i/m}(u) = m$. In Section 8.7, we prove that $(W^{*,(m)})_m$ belongs to
$\mathcal{W}$, with a "limit weighting" $(W^*_t(\cdot))_{t\in[0,1]}$ given by

\[
\alpha u W^*_t(u) = \bar\Phi\Big(\frac{\mu(t)}{2} + \frac{c^\infty(u)}{\mu(t)}\Big),
\]

where $c^\infty(u) \in \mathbb{R}$ satisfies $\int_0^1 \bar\Phi\big(\frac{\mu(t)}{2} + \frac{c^\infty(u)}{\mu(t)}\big)\,dt = \alpha u$. Similarly, any weight
vector sequence of the form $w^{(m)} = W^{*,(m)}(u_0)$ (with $u_0$ fixed in $(0,1]$) belongs
to $\mathcal{W}$ with a limit function $G^\infty(u) = \pi_0 \alpha u + \pi_1 \int_0^1 F_t(\alpha u W^*_t(u_0))\,dt$ continuous
and strictly concave (implying $\mathcal{I}^-_\lambda(G^\infty) > 0$ for $\lambda < \mathcal{I}(G^\infty)$). Denoting by $G^{*,\infty}$
the limit function of $G_{W^{*,(m)}}$, we merely check that $G^{*,\infty} \geq G^\infty$. Since this
holds for any choice of $u_0$, we derive that $\mathcal{I}^-_\lambda(G^{*,\infty}) > 0$ for $\lambda < \mathcal{I}(G^{*,\infty})$ (using
inequalities similar to (25)). As a consequence, we may apply Theorem 4.3 to
obtain the oracle inequality (16). In particular, the power of the multi-weighted
procedure $\mathrm{SU}(W^{*,(m)})$ is always asymptotically larger than the power of the
weighted linear step-up $\mathrm{LSU}(w^{(m)})$ for any weight vector of the form $w^{(m)} =
W^{*,(m)}(u_0)$, with $u_0 \in (0,1]$. Roughly, the latter signifies that $\mathrm{SU}(W^{*,(m)})$
automatically chooses the best weighting among the weight vectors $W^{*,(m)}(u_0)$, $u_0 \in (0,1]$.
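The Gaussian optimal weights described above can be computed numerically. The following is a minimal sketch, assuming a finite vector of strictly positive means and the upper-tail Gaussian form of the weights given in the text; the function names `norm_sf` and `optimal_gaussian_weights` and the bisection scheme are illustrative choices, not part of the paper.

```python
import math


def norm_sf(x):
    """Upper tail P(N(0,1) >= x) of the standard Gaussian distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def optimal_gaussian_weights(means, alpha, u, n_iter=200):
    """Sketch: solve (1/m) * sum_i sf(mu_i/2 + c/mu_i) = alpha*u for c by bisection.

    Assumes every mean is > 0 and 0 < alpha*u < 1 (hypothetical helper).
    Returns (weights, c) with alpha*u*W_i = sf(mu_i/2 + c/mu_i), so sum(W) = m.
    """
    m = len(means)
    target = alpha * u

    def psi(c):
        return sum(norm_sf(mu / 2.0 + c / mu) for mu in means) / m

    # psi is continuous and decreasing in c: widen the bracket until it
    # contains the solution, then bisect.
    lo, hi = -1.0, 1.0
    while psi(lo) < target:
        lo *= 2.0
    while psi(hi) > target:
        hi *= 2.0
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if psi(mid) > target:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    weights = [norm_sf(mu / 2.0 + c / mu) / target for mu in means]
    return weights, c
```

Calling `optimal_gaussian_weights(means, alpha, k / m)` for each `k` in `1..m` would give one weight vector per rejection proportion, which is how a weight function (rather than a single weight vector) could be tabulated in practice.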

5. Simulation study 

An important point is now to evaluate the improvement brought by the new
multi-weighted procedure, both when the true parameters and when misspecified
parameters are plugged into the optimal weighting. For this, we perform
simulations in the restricted but convenient one-sided Gaussian testing framework
under the conditional model.

5.1. Simulations framework 

We consider the problem of testing, for each $i \in \{1, \dots, m\}$, the null "$\mu_i = 0$"
against the alternative "$\mu_i > 0$" from the observation of $m$ independent variables
$(X_i)_i$ with $X_i \sim \mathcal{N}(\mu_i, 1)$. The parameters $(H, F)$ of the (conditional) model
are fully determined from the vector $\mu = (\mu_i)_i$, namely by $H_i = \mathbf{1}\{\mu_i > 0\}$ and
(2), respectively. They represent information of a different nature: $H$ provides
the location of the positive means while $F$ supplies their values.

For all our experiments, the number of tests is $m = 1000$. The vector $\mu$ is
taken such that the $m_0 = 700$ first components of $\mu$ are equal to zero (the
proportion of zeros in the mean vector is thus $\pi_0 = 0.7$). The $m_1 = 300$ remaining
non-zero means are taken in two different ways:

• Case 1: the non-zero means increase linearly from :^'p to 3/1. 

• Case 2: the non-zero means are gathered in three groups of different values
$\bar\mu$, $2\bar\mu$ and $3\bar\mu$, of respective sizes 120, 120 and 60.

In both cases $\bar\mu$ is an "effect size" parameter taking values in the range
$0.5 + 0.25k$, $k \in \{0, \dots, 10\}$.
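As an illustration of this framework, the following is a minimal sketch of how the Case 2 mean vector and the corresponding one-sided p-values could be generated. It assumes the p-value of test i is the Gaussian upper tail evaluated at X_i; the helper names `case2_means` and `one_sided_pvalues` and the seed are hypothetical.

```python
import math
import random


def case2_means(mu_bar, m0=700, sizes=(120, 120, 60)):
    """Mean vector of Case 2: m0 zeros, then groups at mu_bar, 2*mu_bar, 3*mu_bar."""
    means = [0.0] * m0
    for factor, size in zip((1, 2, 3), sizes):
        means.extend([factor * mu_bar] * size)
    return means


def one_sided_pvalues(means, rng):
    """Draw X_i ~ N(mu_i, 1) and return p_i = P(N(0,1) >= X_i) (assumed p-value form)."""
    return [0.5 * math.erfc(rng.gauss(mu, 1.0) / math.sqrt(2.0)) for mu in means]


rng = random.Random(0)                              # fixed seed, for reproducibility only
means = case2_means(mu_bar=1.0)
pvalues = one_sided_pvalues(means, rng)
H = [1 if mu > 0 else 0 for mu in means]            # location of the positive means
effect_sizes = [0.5 + 0.25 * k for k in range(11)]  # grid of effect size values
```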

The following procedures are considered: 

- [LSU] the linear step-up procedure LSU,

- [LSU*] the step-up procedure with threshold collection $\alpha u / \pi_0$,

- [SU-W-oracle] the multi-weighted step-up procedure $\mathrm{SU}(W^*)$ of Theorem 4.1,
using the optimal weight matrix $W^*$ (given by (11)),

- [SD-W-oracle] the multi-weighted step-down procedure $\mathrm{SD}(W^*)$ of Theorem 4.1,
using the optimal weight matrix $W^*$,

- [Unif-oracle] the weighted linear step-up procedure $\mathrm{LSU}(w^*)$ using a weight
vector uniform on $\mathcal{H}_1$: $w_i^* = 0$ for $\mu_i = 0$ and $w_i^* = m/m_1$ for $\mu_i > 0$.
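The procedures listed above share the same skeleton: a step-up or step-down search over the number of rejections k, with thresholds of the form alpha * W_i(k/m) * k/m. The sketch below illustrates that skeleton under this assumption; `weight(i, k)` is a hypothetical callable standing for W_i(k/m), the correction applied to the weights in Theorem 4.1 is not included, and the quadratic-time loops are chosen for readability rather than speed.

```python
def rejection_count_step_up(pvalues, weight, alpha):
    """Largest k with #{i : p_i <= alpha * weight(i, k) * k / m} >= k (0 if none)."""
    m = len(pvalues)
    for k in range(m, 0, -1):
        count = sum(p <= alpha * weight(i, k) * k / m for i, p in enumerate(pvalues))
        if count >= k:
            return k
    return 0


def rejection_count_step_down(pvalues, weight, alpha):
    """Largest k such that the counting condition holds for every k' <= k."""
    m = len(pvalues)
    k_hat = 0
    for k in range(1, m + 1):
        count = sum(p <= alpha * weight(i, k) * k / m for i, p in enumerate(pvalues))
        if count < k:
            break
        k_hat = k
    return k_hat


def rejected_set(pvalues, weight, alpha, k):
    """Indices rejected once the number of rejections k has been determined."""
    m = len(pvalues)
    if k == 0:
        return []
    return [i for i, p in enumerate(pvalues) if p <= alpha * weight(i, k) * k / m]
```

With this convention, [LSU] corresponds to `weight = lambda i, k: 1.0`, [LSU*] to the constant `1 / pi0`, and [Unif-oracle] to `m / m1` on the false nulls and `0.0` elsewhere.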

The procedures [SU-W-oracle], [SD-W-oracle] and [Unif-oracle] correspond to
the case where the weighting uses the true mean vector $\mu$, hence the name
"oracle". In situations where we replace $\mu$ by a "guess" $\tilde\mu$ in the weights, the
procedures are called "guessed" and are denoted by [SU-W-guess], [SD-W-guess],
[Unif-guess] respectively. The procedure [Unif-oracle/guess] applies a uniform
weighting over the (guessed) false nulls and is close in spirit to the approach
of Genovese et al. (2006). It only takes into account the subset where the
hypotheses are false ("location information"), but not the values of the non-zero
means.

The procedure [LSU*] is performed to compare with quite recent developments
on $\pi_0$-adaptive procedures (see e.g. Benjamini et al. (2006)). Since it
uses a perfect estimation of $\pi_0$, it represents the best theoretical $\pi_0$-adaptive
modification of the LSU that we can build. For the sake of clarity, we avoid the
problem of choosing a particular estimator of $\pi_0$ and we only consider [LSU*].

All the latter procedures have provable FDR control (see Section 4.1), so
that it is relevant to compare them in terms of power. In all experiments the
targeted FDR level is either $\alpha = 0.01$ or $\alpha = 0.05$. The different
procedures are compared in terms of relative power (RelPow) with respect to the
LSU procedure, defined as the expected surplus proportion of correct rejections
among the false nulls: for a multiple testing procedure $R$,

\[
\mathrm{RelPow}(R) = (m_1)^{-1} \big( \mathbb{E}(|R \cap \mathcal{H}_1|) - \mathbb{E}(|\mathrm{LSU} \cap \mathcal{H}_1|) \big). \tag{18}
\]

Roughly speaking, this relative power represents the surplus "probability" of a
false null to be rejected with respect to the LSU. This power is estimated using
Monte-Carlo simulations. Additionally, we also evaluate the "power range"
defined by the power of the weighted linear procedures $\mathrm{LSU}(W^*(u_0))$ for any
$u_0 \in \{1/m, 2/m, \dots, 1\}$. It is represented by a gray area on the figures.
Finally, the optimal multi-weighted step-up procedure $\mathrm{SU}(W^*)$ without correction
(which controls the FDR when $m \to \infty$) is also considered, but it is not
reported on our figures, because its (relative) power is almost indistinguishable
from the top of the power range.
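A Monte-Carlo estimate of the relative power (18) can be sketched as follows for [LSU*] against [LSU]. This is only an illustration under stated assumptions: one-sided Gaussian p-values, a mean vector containing at least one positive entry, and hypothetical helper names `bh_rejections` and `estimate_relpow`.

```python
import math
import random


def bh_rejections(pvalues, level):
    """Indices rejected by the linear step-up procedure run at the given level."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_hat = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= level * rank / m:
            k_hat = rank          # keep the largest rank passing its threshold
    return set(order[:k_hat])


def estimate_relpow(means, alpha, n_sim, rng):
    """Monte-Carlo estimate of RelPow for LSU* (thresholds alpha*u/pi0) versus LSU."""
    m = len(means)
    false_nulls = {i for i, mu in enumerate(means) if mu > 0}
    pi0 = 1.0 - len(false_nulls) / m
    surplus = 0
    for _ in range(n_sim):
        p = [0.5 * math.erfc(rng.gauss(mu, 1.0) / math.sqrt(2.0)) for mu in means]
        lsu = bh_rejections(p, alpha)
        lsu_star = bh_rejections(p, alpha / pi0)
        surplus += len(lsu_star & false_nulls) - len(lsu & false_nulls)
    return surplus / (n_sim * len(false_nulls))
```

Because both procedures are evaluated on the same simulated p-values, the difference in (18) is estimated with paired samples, which reduces the Monte-Carlo noise compared with two independent runs.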

5.2. Procedures using the true parameters 

We report in Figure 2 the relative power (18) of [LSU], [LSU*], [SU-W-oracle],
[SD-W-oracle] and [Unif-oracle] as a function of the parameter $\bar\mu$ (1000
simulations). The gray area represents the power range as defined in the previous
section.

The conclusion of this experiment is that, in the most favorable case where the
multi-weighting is used with the true parameters of the model, the improvement
of the multi-weighted procedures over [LSU] is satisfactory. Also, [SD-W-oracle]
performs here better than [SU-W-oracle] (especially for $\alpha = 0.05$), so that the
loss in the correction within [SU-W-oracle] seems significantly larger than the
loss in the correction within [SD-W-oracle].

Fig 2. Procedures using the true parameters. Relative power of [LSU] (solid), [LSU*]
(short-dashed), [SU-W-oracle] (dotted-dashed), [SD-W-oracle] (long-dashed) and
[Unif-oracle] (dotted) as a function of $\bar\mu$ (see text). Left: case 1 of means, right:
case 2 of means. Top: $\alpha = 0.01$; bottom: $\alpha = 0.05$.

Furthermore, [SD-W-oracle] is more powerful than [LSU*] (actually, this is
still true using a smaller $\pi_0$, e.g. $\pi_0 = 0.5$), and [SD-W-oracle] is always better
than [Unif-oracle], and sometimes allows many more discoveries. This seems
coherent because [SD-W-oracle] takes into account more (correct) prior information
than [Unif-oracle]: namely, [SD-W-oracle] uses both the values and the
location of the non-zero means (we are in the conditional model), while
[Unif-oracle] only uses the location information.

Finally, the procedure [SD-W-oracle] is close to the top of the power range
(gray area), that is, has a power close to the power of the best procedure among
$\mathrm{LSU}(W^*(u_0))$, $u_0 \in \{1/m, 2/m, \dots, 1\}$. This corroborates the optimality
results of Section 4.2 and Section 4.3 in this (conditional) setting.
5.3. Procedures using misspecified parameters

We consider here the same experiment as before, except that we take into
account the "randomness" due to a prior guess $\tilde\mu_i$ of each $\mu_i$. For this, we add a
misspecification parameter $\sigma$ and we suppose that the guessed means are of the
form: $\forall i \in \{1, \dots, m\}$,

\[
\tilde\mu_i = \mu_i + \varepsilon_i,
\]

where the $\varepsilon_i$ are i.i.d. with distribution $\mathcal{N}(0, \sigma^2)$ (taken independent of the $p_i$'s). The
misspecification parameter $\sigma$ is taken in the range $\{j/4,\ j = 0, \dots, 12\}$. Note
that this way of guessing the means is quite crude, because it does not take
into account the specific form of the parameters (of course, this guessing can
be improved here by taking local means). However, we keep this crude modeling
here because we do not want to make any assumption on the parameters.
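The guessing mechanism can be sketched in a few lines. The code below is a minimal illustration, assuming that the uniform "guessed" weighting puts equal weight on the indices whose guessed mean is positive; the helper names and the fallback to unit weights when no guess is positive are assumptions made for the example.

```python
import random


def guessed_means(means, sigma, rng):
    """Guess of each mean: the true mean plus independent N(0, sigma^2) noise."""
    return [mu + rng.gauss(0.0, sigma) for mu in means]


def uniform_guess_weights(guess):
    """Uniform weights on the guessed false nulls (indices with positive guess)."""
    m = len(guess)
    support = [g > 0 for g in guess]
    n_pos = sum(support)
    if n_pos == 0:
        return [1.0] * m   # assumption: fall back to the unweighted procedure
    return [m / n_pos if s else 0.0 for s in support]


sigmas = [j / 4 for j in range(13)]   # misspecification grid {j/4, j = 0, ..., 12}
```

Note that with Gaussian noise roughly half of the true nulls receive a positive guess as soon as sigma > 0, which is one way to see why a purely location-based weighting degrades quickly when the guesses are noisy.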

Figure 3 reports the relative power (18) of [LSU], [LSU*], [SU-W-guess],
[SD-W-guess] and [Unif-guess] with respect to $\sigma$. We performed 100 simulations to
compute the relative power, and the latter is moreover averaged over 10 generated
values of the $\tilde\mu_i$'s (for each value of $\sigma$).

In this experiment, we see that both multi-weighted procedures are better
than the other procedures when the guesses are good, i.e. over the range $\sigma \in [0, 1.2]$,
but may be worse than the simple [LSU] procedure when $\sigma$ is large. Furthermore,
note that the procedure [Unif-guess] quickly collapses when $\sigma$ grows and
therefore only offers a slight improvement over [LSU] (or [LSU*]) when the
guesses are good. However, it is "less risky" than the multi-weighted procedures
for large $\sigma$. Again, this conclusion is natural because the multi-weighted
procedures take into account more prior information than [Unif-guess].

Finally, although admittedly of a limited scope, these experiments show that
in principle, taking into account a correct guess of the parameters in the
multi-weighted procedures should improve the power substantially. The loss/gain
magnitude of these procedures depends on the quantity of prior information used
(location of the positive means, value of the positive means, or both).

Fig 3. Procedures using misspecified parameters. Relative power of [LSU] (solid), [LSU*]
(short-dashed), [SU-W-guess] (dotted-dashed), [SD-W-guess] (long-dashed) and [Unif-guess]
(dotted) as a function of the misspecification parameter $\sigma$ (see text). Left: case 1 of means,
right: case 2 of means (see text). $\bar\mu = 1$; $\alpha = 0.05$.

6. Application to mRNA and DNA microarray experiments 

In a typical microarray experiment, we want to find differentially expressed
mRNA genes between two groups of individuals. For the $i$-th gene, the expression
levels are of the form $X_{i,1}, \dots, X_{i,k_i}$ for group 1 and $Y_{i,1}, \dots, Y_{i,\ell_i}$ for group
2, where $k_i$ (resp. $\ell_i$) is the number of individuals in group 1 (resp. group 2).

In some microarray experiments, the sample sizes $(k_i, \ell_i)$ available to assess
the differential mRNA expression of gene $i$ may strongly depend on $i$, e.g. when
the number of missing data differs per gene. In this application, we consider
a covariate, the DNA copy number status of the same gene, which determines
the groups and the sample sizes. DNA copy number status is obtained from an
independent array CGH experiment, after a few pre-processing steps (see e.g.
Picard et al. (2007)). We focus on the covariate $A_{ij}$ which is equal to 1 when
gene $i$ is gained for individual $j$ (i.e. when sample $j$ has an abnormally high DNA
copy number of gene $i$), and 0 otherwise. The biological goal behind this is to find
the genes for which the mRNA expression is induced by the DNA copy number.
This is particularly useful to study cancer pathologies (see e.g. Hyman et al.
(2002)). Sample size dependent weights are particularly attractive here, because
many genes show a large imbalance between the number of gains (defining group 1)
and non-gains (defining group 2).

Using the above framework, we analyze the microarray lymphoma cancer data
of Muris et al. (2007). In these data $m = 11\,169$ genes and $n = 42$ individuals
are studied. The p-value of each gene was computed using a Mann-Whitney
test. We aim to consider as prior the sample size information only, without
any guess on which hypotheses are false or true. The asymptotic normality of
the Mann-Whitney test statistic is used to define asymptotically optimal
multi-weights $W^*$ which depend only on $(k_i, \ell_i)$ and an estimate for the global effect
$\theta$, which is a gene-independent parameter for the effect of copy number gain on
mRNA gene expression. The expression of the multi-weights and the estimate
$\hat\theta_m$ for $\theta$ are detailed in Roquain and van de Wiel (2008). The estimator $\hat\theta_m$
converges in probability when $m$ grows to infinity, so that we believe that the
fluctuations of $\hat\theta_m$ in the weights will have a marginal effect on the effective FDR
of the resulting multi-weighted procedure when $m$ becomes large (however, we have
not yet formally investigated the corresponding asymptotic study).

We applied the step-up multi-weighted procedure $\mathrm{SU}(W^*)$, using the
estimate $\hat\theta_m = 1.01$ of the global effect size $\theta$. Since $m$ is large we focus on the
unmodified version of our procedure, which guarantees asymptotic FDR control.
For different values of $\alpha$, the number of discoveries of this procedure and of the
LSU are given in Table 1.
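To make the role of the sample sizes concrete, the following sketch computes a Mann-Whitney p-value through the usual normal approximation, whose mean and variance depend only on the two group sizes. It is an illustration only: the one-sided alternative, the absence of a tie correction in the variance, and the function name are assumptions, since the text does not specify these details.

```python
import math


def mann_whitney_pvalue(x, y):
    """One-sided p-value (x tends to exceed y) via the normal approximation.

    U counts the pairs (a, b) with a > b, ties counting 1/2. Under the null,
    U has mean k*l/2 and variance k*l*(k+l+1)/12 when there are no ties.
    """
    k, l = len(x), len(y)
    u_stat = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    mean = k * l / 2.0
    sd = math.sqrt(k * l * (k + l + 1) / 12.0)
    z = (u_stat - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

For a fixed shift between the groups, very unbalanced sizes (small `k` or small `l`) give a smaller standardized statistic than balanced ones, which is the kind of sample size information the weights are meant to exploit.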

Table 1
Number of discoveries for the new step-up and the standard linear step-up

    alpha     LSU    SU(W*)
    0.005      43      98
    0.01      121     156
    0.05      478     476
    0.1       836     859

We observe that our new step-up procedure discovers more differentially
expressed genes when $\alpha \in \{0.005, 0.01\}$. For $\alpha \in \{0.05, 0.1\}$, the performance of
the two step-up procedures is similar. So, the improvement of our procedure
is here mostly noticeable when the proportion of rejections is small. This is in
accordance with our intuition: the prior information (here the sample sizes) is
particularly useful when the proportion of rejections is expected to be small.
Finally, let us remark that these positive results on the sample size problem have
been corroborated in a specific simulation study as well (not reported here).

7. Conclusion and discussions 

When the parameters of the p-value model are known, we proposed to solve
the problem of the LSU optimal weighting by finding a new procedure which
provably outperforms all the weighted LSU procedures (up to small error terms)
and which can be easily computed from these parameters. Our simulations
illustrated the strength of the improvement of our new approach in situations
where it uses the true or misspecified parameters.

In our results, the assumptions concerning the marginal distributions of the
p-values were quite mild: the FDR control only required that each p-value is
uniform under the null while the optimality results only required the strict
concavity of the c.d.f.'s of the p-values. Moreover, the existence of the optimal
weight function only required maximizing the power simultaneously at every
rejection proportion, and we gave strong sufficient assumptions for its existence and
uniqueness.

Several extensions to this work are possible. First, we have supposed the
independence between the p-values throughout the paper, which is a standard but
somewhat unrealistic assumption for applications. In Roquain and van de Wiel
(2008), we proposed some extensions of the present FDR control results to the
case of positive regression dependence or unspecified dependence. However,
the procedures derived in this way seemed too conservative for practical use. Therefore,
there is room left for future investigations, which join the very active (but
challenging) research field studying the impact of p-value dependence on FDR
control (see e.g. Kim and van de Wiel (2008); Romano et al. (2008)).

Second, our FDR controls are done at a level smaller than $\pi_0\alpha$ (asymptotically,
in the unconditional model). Therefore, when $\pi_0$ is small, our procedures are
inevitably conservative, because their actual FDR is much lower than the fixed
target. This is a classical problem for the LSU procedure and several works
have been proposed to address this issue, by integrating a $\pi_0$-estimate in the
threshold, building so-called adaptive LSU procedures (see e.g. Benjamini et al.
(2006); Blanchard and Roquain (2009)). A possible interesting extension of our
work could therefore be to derive adaptive multi-weighted procedures, which
would increase the power when the data contain a lot of signal.

A third, and maybe more important, direction for future work is the
investigation of data-driven weighting. A first idea could be to replace the function
$\mathrm{Pow}_u(\cdot)$, the power at rejection proportion $u$, by an empirical substitute and to
perform the simultaneous maximization with this substitute. This would yield
an empirical optimal weight function $\widehat{W}^*$ that can in turn be integrated in a
multi-weighted procedure. While this certainly requires using a model with
some replications, the theoretical FDR control and power optimality of such a
data-driven procedure do not follow straightforwardly from the present work, because
all our proofs here use the fact that the weight functions are deterministic.

8. Proofs 

8.1. Useful notation and lemmas 

Let us first introduce the following notation that will be useful throughout our
proofs: if $R$ is the step-up procedure associated to a given weight function $W$ of
associated threshold collection $\Delta$, and $\hat u := |R|/m$ its rejection proportion, that
is $\hat u = \mathcal{I}(\widehat G_W)$, we denote by:

1. $R_{-i}$ the step-up procedure on the set of hypotheses corresponding to
$\{1, \dots, m\} \setminus \{i\}$, that is excluding the $i$-th null, and associated to the
threshold collection $\forall j \neq i$, $\forall u$, $\Delta_j((1 - m^{-1})u)$; and we denote by $\hat u_{-i} :=
|R_{-i}|/(m-1)$ its rejection proportion, so that $\hat u_{-i} = \mathcal{I}(\widehat G_{-i})$ with
$\widehat G_{-i}(u) := (m-1)^{-1} \sum_{j \neq i} \mathbf{1}\{p_j \leq \Delta_j((1 - m^{-1})u)\}$;

2. $R'_{-i}$ the step-up procedure on the set of hypotheses excluding the $i$-th null
associated to the threshold collection $\forall j \neq i$, $\forall u$, $\Delta_j((1 - m^{-1})u + m^{-1})$;
and we denote by $\hat u'_{-i} := |R'_{-i}|/(m-1)$ its rejection proportion, hence
$\hat u'_{-i} = \mathcal{I}(\widehat G'_{-i})$ with
$\widehat G'_{-i}(u) := (m-1)^{-1} \sum_{j \neq i} \mathbf{1}\{p_j \leq \Delta_j((1 - m^{-1})u + m^{-1})\}$.

Similarly, when $R$ is step-down, we define $R_{-i}$ and $R'_{-i}$ as step-down procedures
and we denote $\tilde u := \mathcal{J}(\widehat G_W)$, $\tilde u_{-i} := \mathcal{J}(\widehat G_{-i})$, $\tilde u'_{-i} := \mathcal{J}(\widehat G'_{-i})$ instead of
$\hat u$, $\hat u_{-i}$, $\hat u'_{-i}$, respectively.

The two following lemmas make a link between the rejection proportions of
$R$, $R_{-i}$ and $R'_{-i}$ for different values of $p_i$. They are proved in Appendix B and
are related to Lemma 10.20 of Roquain (2007).

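Lemma 8.1. Let $R$ be the step-up procedure associated to a given weight function
of threshold collection $\Delta$ and consider $\hat u$, $\hat u_{-i}$ and $\hat u'_{-i}$ as above. Then we
have point-wise:

1. $p_i \leq \Delta_i(\hat u) \iff p_i \leq \Delta_i((1 - m^{-1})\hat u'_{-i} + m^{-1}) \iff \hat u = (1 - m^{-1})\hat u'_{-i} + m^{-1}$;

2. $p_i > \Delta_i(\hat u) \iff \hat u = (1 - m^{-1})\hat u_{-i}$.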
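The first statement of Lemma 8.1 can be checked numerically in the simplest case. The sketch below assumes the unweighted linear thresholds (threshold alpha*k/m when k hypotheses are rejected) and works with rejection counts instead of proportions; the function names are hypothetical and the check is an illustration, not a proof.

```python
import random


def step_up_count(pvalues, alpha, m_ref, shift):
    """Largest k in {0, ..., n} with #{j : p_j <= alpha*(k+shift)/m_ref} >= k."""
    n = len(pvalues)
    for k in range(n, 0, -1):
        if sum(p <= alpha * (k + shift) / m_ref for p in pvalues) >= k:
            return k
    return 0


def check_lemma_8_1(pvalues, alpha, i):
    """Check: p_i rejected by the full procedure  <=>  m*u = (m-1)*u'_{-i} + 1."""
    m = len(pvalues)
    k_full = step_up_count(pvalues, alpha, m, 0)      # m * u for the full procedure
    others = pvalues[:i] + pvalues[i + 1:]
    k_prime = step_up_count(others, alpha, m, 1)      # (m-1) * u'_{-i}
    rejected = k_full > 0 and pvalues[i] <= alpha * k_full / m
    return rejected == (k_full == k_prime + 1)
```

The `shift` argument encodes the modified threshold collection of the procedure that excludes hypothesis i: with k rejections among m-1 hypotheses, its threshold is the original one evaluated at (k+1)/m.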
Lemma 8.2. Let $R$ be a step-down procedure associated to a given weight function
of threshold collection $\Delta$ and consider $\tilde u$ and $\tilde u_{-i}$ as above. Then we have
point-wise, for any $k \in \{1, \dots, m\}$,

1. $\tilde u \geq k/m$ and $p_i > \Delta_i((k-1)/m) \implies \tilde u_{-i} \geq (k-1)/(m-1)$;

2. $\tilde u_{-i} \geq (k-1)/(m-1)$ and $p_i \leq \Delta_i((k-1)/m) \implies \tilde u \geq k/m$;

3. $p_i > \Delta_i((1 - m^{-1})\tilde u_{-i} + m^{-1}) \implies \tilde u = (1 - m^{-1})\tilde u_{-i}$.



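8.2. Proof of Theorem 4.1 step-up part

The inequalities are established in the conditional model (the result in the
unconditional model directly follows).

We use in all our FDR bounds that a procedure $R$ satisfying the
"self-consistency condition" $R = \{i \mid p_i \leq \Delta_i(|R|/m)\}$ has a FDR equal to

\[
\mathrm{FDR}(R) = \mathbb{E}\left[ \frac{|R \cap \mathcal{H}_0|}{|R| \vee 1} \right]
= \sum_{i=1}^m (1 - H_i)\, \mathbb{E}\left[ \frac{\mathbf{1}\{p_i \leq \Delta_i(|R|/m)\}}{|R|} \right], \tag{19}
\]

with the convention $0/0 = 0$.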



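Now, consider the multi-weighted step-up procedure $R = \mathrm{SU}(\widetilde W)$ of Theorem 4.1,
and denote by $\Delta$ the threshold collection associated to $\widetilde W$: $\Delta_i(k/m) =
\alpha \widetilde W_i(k/m)\, k/m = \alpha W_i(k/m)\,(k/m)\,(1 + \alpha W_i(1))^{-1} < 1$. Since any step-up
procedure satisfies the self-consistency condition, we may use (19). Furthermore,
using the notation of Section 8.1 and applying Lemma 8.1 (first statement), the
assertion $p_i \leq \Delta_i(|R|/m) = \Delta_i(\hat u)$ is equivalent to $\hat u = (1 - m^{-1})\hat u'_{-i} + m^{-1}$.
Thus, we may rewrite the FDR as follows:

\[
\begin{aligned}
\mathrm{FDR}(R) &= \sum_{i=1}^m (1 - H_i)\, \mathbb{E}\left[ \frac{\mathbf{1}\{p_i \leq \Delta_i(\hat u)\}}{\hat u\, m} \right] \\
&= \sum_{i=1}^m (1 - H_i) \sum_{k=1}^m k^{-1}\, \mathbb{P}\big[ p_i \leq \Delta_i(k/m),\ \hat u m = k \big] \\
&= \sum_{i=1}^m (1 - H_i) \sum_{k=1}^m k^{-1}\, \mathbb{P}\big[ p_i \leq \Delta_i(k/m),\ (m-1)\hat u'_{-i} + 1 = k \big].
\end{aligned}
\]

Then, since $\hat u'_{-i}$ only depends on the p-values $(p_j, j \neq i)$, it is independent of
$p_i$ and we obtain

\[
\begin{aligned}
\mathrm{FDR}(R) &= \frac{\alpha}{m} \sum_{i=1}^m (1 - H_i) \sum_{k=1}^m \widetilde W_i(k/m)\, \mathbb{P}\big[ (m-1)\hat u'_{-i} + 1 = k \big] \\
&= \frac{\alpha}{m} \sum_{i=1}^m (1 - H_i) \sum_{k=1}^m W_i(k/m)\,(1 - \Delta_i(1))\, \mathbb{P}\big[ (m-1)\hat u'_{-i} + 1 = k \big] \\
&= \frac{\alpha}{m} \sum_{i=1}^m (1 - H_i) \sum_{k=1}^m W_i(k/m)\, \mathbb{P}\big[ p_i > \Delta_i(1),\ (m-1)\hat u'_{-i} + 1 = k \big],
\end{aligned} \tag{20}
\]

where we used that $p_i$ has a uniform distribution on $[0,1]$ (from (1)), together with the
identity $\widetilde W_i(k/m) = W_i(k/m)(1 - \Delta_i(1))$, which follows from $1 - \Delta_i(1) = (1 + \alpha W_i(1))^{-1}$.
Next, consider the threshold collection $\forall j \in \{1, \dots, m\}$, $\forall u$,
$\Delta'_j(u) = \Delta_j((1 - m^{-1})u + m^{-1})$ and the associated step-up procedure that we denote by $R'$.
Let us also denote its rejection proportion by $\hat u' = |R'|/m$. From the definition of Section 8.1,
the restriction of $R'$ to the hypothesis set corresponding to $\{1, \dots, m\} \setminus \{i\}$ is
exactly the procedure $R'_{-i}$. Therefore, from Lemma 8.1 (second statement applied
to $R'$), the condition $p_i > \Delta_i(1) = \Delta'_i(1) \geq \Delta'_i(\hat u')$ implies $m\hat u' = (m-1)\hat u'_{-i}$.
Therefore,

\[
\begin{aligned}
\mathrm{FDR}(R) &\leq \frac{\alpha}{m} \sum_{i=1}^m (1 - H_i) \sum_{k=1}^m W_i(k/m)\, \mathbb{P}\big[ p_i > \Delta_i(1),\ m\hat u' + 1 = k \big] \\
&\leq \frac{\alpha}{m} \sum_{k=1}^m \left[ \sum_{i=1}^m (1 - H_i) W_i(k/m) \right] \mathbb{P}\big[ m\hat u' + 1 = k \big] \\
&\leq \alpha \max_{1 \leq k \leq m} \left\{ m^{-1} \sum_{i=1}^m (1 - H_i) W_i(k/m) \right\}.
\end{aligned}
\]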



8.3. Proof of Theorem 4.1 step-down part

Again, it is sufficient to look at the conditional model. First, let us prove that for 
any step-down procedure R with threshold collection A and rejection proportion 
u, we have for any i, 

\[
\mathrm{FDR}(R) \leq \sum_{i=1}^m (1 - H_i) \sum_{k=1}^m \frac{1}{k}\, \mathbb{P}\big[ (m-1)\tilde u_{-i} = k-1,\ p_i \leq \Delta_i(k/m) \big], \tag{21}
\]

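where $\tilde u_{-i}$ is the rejection proportion of the step-down procedure associated to
$\Delta$ and restricted to the hypotheses different from the $i$-th hypothesis as defined
in Section 8.1. This result has been implicitly proved in Gavrilov et al. (2009)
(Section 3), using a specific non-weighted step-down procedure. Here, we state
(21) in a more general framework. Applying the two first points of Lemma 8.2,
we obtain the following relations:

\[
\begin{aligned}
& \sum_{k=1}^m \frac{1}{k}\, \mathbb{P}\big[ m\tilde u = k,\ p_i \leq \Delta_i(k/m) \big] \\
&\quad = \sum_{k=1}^m \frac{1}{k} \Big( \mathbb{P}\big[ m\tilde u = k,\ p_i \leq \Delta_i((k-1)/m) \big]
   + \mathbb{P}\big[ m\tilde u = k,\ \Delta_i((k-1)/m) < p_i \leq \Delta_i(k/m) \big] \Big) \\
&\quad = \sum_{k=1}^m \frac{1}{k}\, \mathbb{P}\big[ m\tilde u \geq k,\ \Delta_i((k-1)/m) < p_i \leq \Delta_i(k/m) \big] \\
&\qquad - \sum_{k=1}^m \left( \frac{\mathbf{1}\{k > 1\}}{k-1} - \frac{1}{k} \right) \mathbb{P}\big[ m\tilde u \geq k,\ p_i \leq \Delta_i((k-1)/m) \big] \\
&\quad \leq \sum_{k=1}^m \frac{1}{k}\, \mathbb{P}\big[ (m-1)\tilde u_{-i} \geq k-1,\ \Delta_i((k-1)/m) < p_i \leq \Delta_i(k/m) \big] \\
&\qquad - \sum_{k=1}^m \left( \frac{\mathbf{1}\{k > 1\}}{k-1} - \frac{1}{k} \right) \mathbb{P}\big[ (m-1)\tilde u_{-i} \geq k-1,\ p_i \leq \Delta_i((k-1)/m) \big] \\
&\quad = \sum_{k=1}^m \frac{1}{k}\, \mathbb{P}\big[ (m-1)\tilde u_{-i} = k-1,\ p_i \leq \Delta_i(k/m) \big].
\end{aligned}
\]

As a consequence, the latter combined with (19) states (21).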

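Now, consider the step-down procedure $R$ of Theorem 4.1, that is, associated
to the threshold collection $\Delta_i(k/m) = \alpha \widetilde W_i(k/m)\, k/m = \alpha W_i(k/m)\,(k/m)\,(1 +
\alpha W_i(k/m)\, k/m)^{-1} < 1$. We use the independence between the p-values and (21)
to show

\[
\begin{aligned}
\mathrm{FDR}(R) &\leq \sum_{i=1}^m (1 - H_i) \sum_{k=1}^m \frac{1}{k}\, \mathbb{P}\big[ (m-1)\tilde u_{-i} = k-1,\ p_i \leq \Delta_i(k/m) \big] \\
&= \frac{\alpha}{m} \sum_{i=1}^m (1 - H_i) \sum_{k=1}^m W_i(k/m)\,(1 - \Delta_i(k/m))\, \mathbb{P}\big[ (m-1)\tilde u_{-i} = k-1 \big] \\
&= \frac{\alpha}{m} \sum_{i=1}^m (1 - H_i) \sum_{k=1}^m W_i(k/m)\, \mathbb{P}\big[ (m-1)\tilde u_{-i} = k-1,\ p_i > \Delta_i(k/m) \big].
\end{aligned}
\]

The third point of Lemma 8.2 thus implies

\[
\begin{aligned}
\mathrm{FDR}(R) &\leq \frac{\alpha}{m} \sum_{i=1}^m (1 - H_i) \sum_{k=1}^m W_i(k/m)\, \mathbb{P}\big[ m\tilde u = k-1,\ p_i > \Delta_i(k/m) \big] \\
&\leq \frac{\alpha}{m} \sum_{k=1}^m \left[ \sum_{i=1}^m (1 - H_i) W_i(k/m) \right] \mathbb{P}\big[ m\tilde u = k-1 \big] \\
&\leq \alpha \max_{1 \leq k \leq m} \left\{ m^{-1} \sum_{i=1}^m (1 - H_i) W_i(k/m) \right\}.
\end{aligned}
\]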

8.4. Proof of Theorem 4.2 

Let us assume that the following proposition holds (the proof is given at the 
end of this section): 

Proposition 8.3. In the unconditional model, consider a weight function $W$
with its associated threshold collection $\Delta$ and put $\bar u = \mathcal{I}(G_W)$. Then the
following holds:

(i) assuming that for all $u' \geq u \geq \bar u$, $u' - G_W(u') \geq u - G_W(u)$, we have for
all $\lambda > 0$, $\lambda \leq 1 - \bar u$,



E. Roquain and M. van de Wiel/Optimal weighting for FDR control 



701 



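\[
\mathrm{Pow}(\mathrm{SU}(W)) - (1 - \alpha\pi_0)\bar u
\leq \pi_1 m^2 \exp\big\{ -2m\big(\mathcal{I}^+_\lambda(G_W) - m^{-1}\big)_+^2 \big\} - \mathcal{I}^+_\lambda(G_W) + \lambda(1 - \alpha\pi_0); \tag{22}
\]

(ii) assuming $\Delta \leq 1$, we have for all $\lambda > 0$, $\lambda \leq \bar u$,

\[
\mathrm{Pow}(\mathrm{SU}(W)) - (1 - \alpha\pi_0)\bar u
\geq -\pi_1 m \exp\big\{ -2m\big(\mathcal{I}^-_\lambda(G_W)\big)_+^2 \big\} + \mathcal{I}^-_\lambda(G_W) - \lambda(1 - \alpha\pi_0). \tag{23}
\]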

We now prove Theorem 4.2 by applying Proposition 8.3. First, remark that,
in the unconditional model, we have for any weight vector $w$,

\[
G_w(u) = \alpha\pi_0 u + \mathrm{Pow}_u(w),
\]

so that maximizing in $w$ the power at rejection level $u$ is equivalent to maximizing
$G_w(u)$ in $w$. As a consequence, taking the optimal weight function $W^*$, we
deduce from (9) that for any weight vector $w$ and for any $u$ we have $G_w(u) \leq
G_{W^*}(u)$. Denoting $u_w := \mathcal{I}(G_w)$ and $u^* := \mathcal{I}(G_{W^*})$, this in turn implies that

\[
u_w \leq u^*. \tag{24}
\]

Second, remark that $W^*$ has a threshold collection $\Delta^*$ satisfying $\Delta^* \leq 1$. The
latter holds because the $F_i$'s are increasing (as non-decreasing strictly concave
functions), and because $\Delta^*_i(u) \leq \alpha W^*_i(1)$ with $W^*(1)$ maximizing the power at
rejection proportion 1. Third, we check the assumption of (i) of Proposition 8.3 for
$W(\cdot)$ constantly equal to a weight vector $w$, which directly follows from the strict
concavity of $G_w$ (itself coming from the strict concavity of the $F_i$'s). Fourth, let
us prove that $\mathcal{I}^-_\lambda(G_w) > 0$ and $\mathcal{I}^-_\lambda(G_{W^*}) > 0$. The first statement comes from
the definition of $\mathcal{I}(G_w)$. To prove the second statement, consider $u^* = \mathcal{I}(G_{W^*})$
and the weight vector $w = W^*(u^*)$, so that $u^*$ is equal to $u_w = \mathcal{I}(G_w)$ (because
$u_w \leq u^*$ from (24)). Using again that $W^*$ is a maximum, we obtain

\[
G_{W^*}(u^* - \lambda) \geq G_w(u^* - \lambda) = G_w(u_w - \lambda)
= \frac{G_w(u_w - \lambda)}{u_w - \lambda}\,(u_w - \lambda)
> \frac{G_w(u_w)}{u_w}\,(u_w - \lambda) = u^* - \lambda, \tag{25}
\]

by the strict concavity of $G_w$. This implies $\mathcal{I}^-_\lambda(G_{W^*}) > 0$.

Finally, using (22) with $W(\cdot)$ constantly equal to any weight vector $w$,
together with (23) used with $W = W^*$, we obtain for all $\lambda > 0$, $\lambda \leq u^*$ and
$\lambda \leq \pi_0(1 - \alpha)$,

\[
\begin{aligned}
\mathrm{Pow}(\mathrm{SU}(W^*))
&\geq (1 - \alpha\pi_0)u^* - \varepsilon(m, \mathcal{I}^-_\lambda(G_{W^*})) - \lambda(1 - \alpha\pi_0) \\
&\geq (1 - \alpha\pi_0)u_w - \varepsilon(m, \mathcal{I}^-_\lambda(G_{W^*})) - \lambda(1 - \alpha\pi_0) \\
&\geq \mathrm{Pow}(\mathrm{LSU}(w)) - \varepsilon(m, \mathcal{I}^+_\lambda(G_w)) - \varepsilon(m, \mathcal{I}^-_\lambda(G_{W^*})) - 2\lambda(1 - \alpha\pi_0),
\end{aligned}
\]

which proves (15).
Let us now prove Proposition 8.3. Using that the procedure $R = \mathrm{SU}(W)$
satisfies the self-consistency condition $R = \{i \mid p_i \leq \Delta_i(|R|/m)\}$, Lemma 8.1
(first statement) and the notation of Section 8.1, the power of the procedure $R$
may be expressed as follows:

\[
\begin{aligned}
\mathrm{Pow}(\mathrm{SU}(W)) &= \pi_1 m^{-1} \sum_{i=1}^m \mathbb{P}\big[ p_i \leq \Delta_i(|R|/m) \mid H_i = 1 \big] \\
&= \pi_1 m^{-1} \sum_{i=1}^m \mathbb{P}\big[ p_i \leq \Delta_i((1 - m^{-1})\hat u'_{-i} + m^{-1}) \mid H_i = 1 \big] \\
&\leq \pi_1 m^{-1} \sum_{i=1}^m \mathbb{E}\big[ F_i \circ \Delta_i((1 - m^{-1})\hat u'_{-i} + m^{-1}) \big],
\end{aligned} \tag{26}
\]

where we used both the independence between $p_i$ and $\hat u'_{-i}$ conditionally on $H$
and the independence between $\hat u'_{-i}$ and $H_i$. For simplicity, we introduce the
increasing function $\phi(u) := (1 - m^{-1})u + m^{-1}$ with inverse $\phi^{-1}(v) = (mv - 1)/(m - 1)$.
Fix now $\lambda > 0$, with $\bar u + \lambda \leq 1$. Expression (26) may be rewritten as

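\[
\begin{aligned}
\mathrm{Pow}(\mathrm{SU}(W)) - (1 - \alpha\pi_0)\bar u
&\leq \mathbb{E}\left[ \pi_1 m^{-1} \sum_{i=1}^m F_i \circ \Delta_i(\phi(\hat u'_{-i})) \right] - (1 - \alpha\pi_0)\bar u \\
&\leq \pi_1 \mathbb{P}(\Omega_1^c) + G_W(\bar u + \lambda) - \alpha\pi_0(\bar u + \lambda) - (1 - \alpha\pi_0)\bar u \\
&= \pi_1 \mathbb{P}(\Omega_1^c) + G_W(\bar u + \lambda) - (\bar u + \lambda) + \lambda(1 - \alpha\pi_0),
\end{aligned}
\]

where $\Omega_1$ denotes the event $\{\forall i,\ 1 \leq i \leq m,\ \phi(\hat u'_{-i}) \leq \bar u + \lambda\}$ and where the last
inequality comes from the definition of $\bar u$. We upper-bound now the probability
of $\Omega_1^c$:

\[
\begin{aligned}
\mathbb{P}(\Omega_1^c) &\leq \sum_{i=1}^m \sum_{u > \phi^{-1}(\bar u + \lambda)} \mathbf{1}\{u(m-1) \in \{0, 1, \dots, m-1\}\}\, \mathbb{P}\big[ \widehat G'_{-i}(u) \geq u \big] \\
&= \sum_{i=1}^m \sum_{v > \bar u + \lambda} \mathbf{1}\{vm \in \{1, 2, \dots, m\}\}\, \mathbb{P}\big[ \widehat G'_{-i}(\phi^{-1}(v)) \geq \phi^{-1}(v) \big] \\
&\leq \sum_{i=1}^m \sum_{v > \bar u + \lambda} \mathbf{1}\{vm \in \{1, 2, \dots, m\}\}\, \mathbb{P}\big[ \widehat G_W(v) \geq v - m^{-1} \big],
\end{aligned}
\]

where the last inequality uses that $m\widehat G_W(v) \geq (m-1)\widehat G'_{-i}(\phi^{-1}(v))$. As a
consequence:

\[
\begin{aligned}
\mathbb{P}[\Omega_1^c] &\leq m \sum_{\substack{v > \bar u + \lambda \\ vm \in \{1, 2, \dots, m\}}} \mathbb{P}\big[ \widehat G_W(v) - G_W(v) \geq v - G_W(v) - m^{-1} \big] \\
&\leq m \sum_{\substack{v > \bar u + \lambda \\ vm \in \{1, 2, \dots, m\}}} \mathbb{P}\big[ \widehat G_W(v) - G_W(v) \geq (\bar u + \lambda) - G_W(\bar u + \lambda) - m^{-1} \big] \\
&\leq m^2 \exp\big\{ -2m\big( (\bar u + \lambda) - G_W(\bar u + \lambda) - m^{-1} \big)_+^2 \big\},
\end{aligned} \tag{27}
\]

where we used successively the assumption in (i) of Proposition 8.3 and
Hoeffding's inequality (see Hoeffding (1963)) for the last inequality. This finally
yields (22).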
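For the reader's convenience, the concentration step can be spelled out. The display below is a sketch, assuming (as the notation suggests) that in the unconditional model the indicators $\mathbf{1}\{p_i \leq \Delta_i(v)\}$ are independent and that the mean of their empirical average $\widehat G_W(v)$ is $G_W(v)$; the symbol $t$ is introduced here for illustration only.

```latex
% Hoeffding's inequality for an average of m independent [0,1]-valued variables:
% for every fixed v and every t > 0,
\[
\mathbb{P}\big[\, \widehat G_W(v) - G_W(v) \geq t \,\big] \;\leq\; \exp\{-2 m t^2\},
\qquad
\widehat G_W(v) = \frac{1}{m}\sum_{i=1}^m \mathbf{1}\{p_i \leq \Delta_i(v)\}.
\]
% Applied with t = (\bar u + \lambda) - G_W(\bar u + \lambda) - m^{-1} whenever t > 0
% (the bound is trivial otherwise, hence the positive part), and summed over the
% at most m grid points v and the m indices i, this gives
\[
\mathbb{P}[\Omega_1^c] \;\leq\; m \cdot m \cdot
\exp\big\{-2m\big((\bar u + \lambda) - G_W(\bar u + \lambda) - m^{-1}\big)_+^2\big\}.
\]
```

The two factors of $m$ come from the union over the excluded index $i$ and from the number of admissible grid values of $v$, respectively.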

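The point (ii) of Proposition 8.3 is similar: noticing that (26) is an equality
when $\Delta \leq 1$, we obtain

\[
\mathrm{Pow}(\mathrm{SU}(W)) - (1 - \alpha\pi_0)\bar u
\geq -\pi_1 \mathbb{P}(\Omega_2^c) + \big( G_W(\bar u - \lambda) - (\bar u - \lambda) \big) - \lambda(1 - \alpha\pi_0),
\]

with $\Omega_2 = \{\forall i,\ 1 \leq i \leq m,\ \phi(\hat u'_{-i}) \geq \bar u - \lambda\}$. Next, we have

\[
\begin{aligned}
\mathbb{P}(\Omega_2^c) &\leq \sum_{i=1}^m \mathbb{P}\big[ \hat u'_{-i} < \phi^{-1}(\bar u - \lambda) \big] \\
&= \sum_{i=1}^m \mathbb{P}\big[ \widehat G'_{-i}(\phi^{-1}(\bar u - \lambda)) < \phi^{-1}(\bar u - \lambda) \big] \\
&\leq \sum_{i=1}^m \mathbb{P}\big[ \widehat G_W(\bar u - \lambda) < \bar u - \lambda \big],
\end{aligned}
\]

where the last inequality uses that $m\widehat G_W(\bar u - \lambda) \leq (m-1)\widehat G'_{-i}(\phi^{-1}(\bar u - \lambda)) + 1$
and thus $\phi^{-1}(\widehat G_W(\bar u - \lambda)) \leq \widehat G'_{-i}(\phi^{-1}(\bar u - \lambda))$. As a consequence, we obtain

\[
\begin{aligned}
\mathbb{P}(\Omega_2^c) &\leq m\, \mathbb{P}\big[ \widehat G_W(\bar u - \lambda) < \bar u - \lambda \big] \\
&= m\, \mathbb{P}\big[ \widehat G_W(\bar u - \lambda) - G_W(\bar u - \lambda) < \bar u - \lambda - G_W(\bar u - \lambda) \big] \\
&\leq m \exp\big\{ -2m\big( G_W(\bar u - \lambda) - (\bar u - \lambda) \big)_+^2 \big\},
\end{aligned}
\]

which implies (23).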

8.5. Proof of Theorem 4.3 

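First remark that for any weight function sequence of $\mathcal{W}$, the convergence of
$(G_{W^{(m)}})_m$ to $G^\infty$ is uniform, because all these functions are non-decreasing
and because $G^\infty$ is assumed to be continuous on $[0,1]$. Next we prove that
$\mathcal{I}(G_{W^{(m)}}) \to \mathcal{I}(G^\infty)$. (This will imply directly that $\mathcal{I}^+_\lambda(G_{W^{(m)}}) \to \mathcal{I}^+_\lambda(G^\infty)$
and $\mathcal{I}^-_\lambda(G_{W^{(m)}}) \to \mathcal{I}^-_\lambda(G^\infty)$ for $\lambda < \mathcal{I}(G^\infty)$.) For this, take a subsequence $m'$
such that $\mathcal{I}(G_{W^{(m')}})$ converges and prove that its limit $\ell$ is equal to $\mathcal{I}(G^\infty)$.
From the uniform convergence and the continuity of $G^\infty$, $\ell$ satisfies $G^\infty(\ell) = \ell$.
If $\mathcal{I}(G^\infty) = 0$, the only possible fixed point of $G^\infty$ is 0 and $\ell = 0$. If $\mathcal{I}(G^\infty) > 0$,
0 and $\mathcal{I}(G^\infty)$ are the only possible fixed points of $G^\infty$ (because $\mathcal{I}^-_\lambda(G^\infty) > 0$
for $\lambda < \mathcal{I}(G^\infty)$). Next, we have $G_{W^{(m')}}(\mathcal{I}(G^\infty)/2) > \mathcal{I}(G^\infty)/2$ for large $m'$
(because $G^\infty(\mathcal{I}(G^\infty)/2) > \mathcal{I}(G^\infty)/2$) and thus $\ell \geq \mathcal{I}(G^\infty)/2 > 0$, which in
turn implies $\ell = \mathcal{I}(G^\infty)$.

Fix a sequence of weight vectors $(w^{(m)})_m$ belonging to $\mathcal{W}$. We aim now to
prove:

\[
\lim_m \mathrm{Pow}(\mathrm{SU}(W^{*,(m)})) = (1 - \pi_0\alpha) \lim_m \{\mathcal{I}(G_{W^{*,(m)}})\} \tag{28}
\]

\[
\lim_m \mathrm{Pow}(\mathrm{SU}(w^{(m)})) = (1 - \pi_0\alpha) \lim_m \{\mathcal{I}(G_{w^{(m)}})\} \tag{29}
\]

Expression (16) will then directly follow from $\mathcal{I}(G_{W^{*,(m)}}) \geq \mathcal{I}(G_{w^{(m)}})$ (as stated
in the proof of Theorem 4.2).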

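Let us state now (28) (the proof for (29) is similar). Fix $\lambda > 0$ with $\lambda \leq
\pi_0(1 - \alpha)$. Applying Proposition 8.3, we obtain that

\[
\begin{aligned}
& \big| \mathrm{Pow}(\mathrm{SU}(W^{*,(m)})) - (1 - \pi_0\alpha)\mathcal{I}(G_{W^{*,(m)}}) \big| \\
&\quad \leq \varepsilon(m, \mathcal{I}^+_\lambda(G_{W^{*,(m)}})) + \varepsilon(m, \mathcal{I}^-_\lambda(G_{W^{*,(m)}}))\, \mathbf{1}\{\lambda \leq \mathcal{I}(G_{W^{*,(m)}})\} + 2\lambda(1 - \pi_0\alpha).
\end{aligned}
\]

First, denoting $G^{*,\infty} = \lim_m G_{W^{*,(m)}}$, we have $\mathcal{I}^+_\lambda(G_{W^{*,(m)}}) \to \mathcal{I}^+_\lambda(G^{*,\infty}) > 0$
and thus $\lim_m \varepsilon(m, \mathcal{I}^+_\lambda(G_{W^{*,(m)}})) = 0$. Second, if $\mathcal{I}(G^{*,\infty}) > 0$, we have
$\mathcal{I}^-_\lambda(G^{*,\infty}) > 0$ for $\lambda < \mathcal{I}(G^{*,\infty})$ and thus $\lim_m \varepsilon(m, \mathcal{I}^-_\lambda(G_{W^{*,(m)}})) = 0$ for
$\lambda < \mathcal{I}(G^{*,\infty})$. If $\mathcal{I}(G^{*,\infty}) = 0$, we trivially have that $\mathbf{1}\{\lambda \leq \mathcal{I}(G_{W^{*,(m)}})\}$ is
equal to zero for $m$ large. As a result, we obtain for $\lambda$ small enough that

\[
\limsup_m \big| \mathrm{Pow}(\mathrm{SU}(W^{*,(m)})) - (1 - \pi_0\alpha)\mathcal{I}(G_{W^{*,(m)}}) \big| \leq 2\lambda(1 - \pi_0\alpha).
\]

This yields (28) by letting $\lambda \to 0$ and by noticing that $\lim_m \{\mathcal{I}(G_{W^{*,(m)}})\}$ exists.
Finally, we have to check that the use of $W^{*,(m)}$ in Proposition 8.3 was allowed,
i.e. that for all $m$ and $u' \geq u \geq u^* := \mathcal{I}(G_{W^{*,(m)}})$, the inequality $u' - G_{W^{*,(m)}}(u') \geq
u - G_{W^{*,(m)}}(u)$ holds. For this, we let $w := W^{*,(m)}(u')$ and $u_w := \mathcal{I}(G_w)$. Since
$u^* \geq u_w$ and $G_w(u_w) = u_w$ we have $u' - G_w(u') \geq u - G_w(u)$ ($G_w$ being
strictly concave). Therefore, for this particular weight vector $w$,

\[
u' - G_{W^{*,(m)}}(u') = u' - G_w(u') \geq u - G_w(u) \geq u - G_{W^{*,(m)}}(u),
\]

where the last inequality holds because $W^{*,(m)}$ is a maximum.

Finally, to get the FDR statement (17), we use the same reasoning as above
combined with the following finite FDR approximation result:

Proposition 8.4. In the unconditional model, consider a weight function $W$
with its associated threshold collection $\Delta$ and put $\bar u = \mathcal{I}(G_W)$. Assume that for
all $u' \geq u \geq \bar u$, $u' - G_W(u') \geq u - G_W(u)$ and take $\lambda > 0$ with $\lambda \leq 1 - \bar u$. Then
the following bounds hold: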
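\[
\begin{aligned}
\mathrm{FDR}(\mathrm{SU}(W)) \leq{}& \pi_0\alpha + \pi_0\alpha\, m^3 \exp\big\{ -2m\big(\mathcal{I}^+_\lambda(G_W) - m^{-1}\big)_+^2 \big\} \\
& + \pi_0\alpha\, \mathbf{1}\{\bar u > \lambda\} \left[ m^2 \exp\big\{ -2m\big(\mathcal{I}^-_\lambda(G_W)\big)_+^2 \big\} + \frac{2\lambda}{\bar u - \lambda} \right] \\
& + \pi_0\alpha\, \mathbf{1}\{\bar u \leq \lambda\} \left[ m^{-1} \sum_{i=1}^m \sup_{0 < u \leq 2\lambda} \{W_i(u)\} - 1 \right].
\end{aligned} \tag{30}
\]

Assuming additionally $\Delta \leq 1$, we have

\[
\begin{aligned}
\mathrm{FDR}(\mathrm{SU}(W)) \geq{}& \pi_0\alpha - \pi_0\alpha\, m^3 \exp\big\{ -2m\big(\mathcal{I}^+_\lambda(G_W) - m^{-1}\big)_+^2 \big\} \\
& - \pi_0\alpha\, \mathbf{1}\{\bar u > \lambda\} \left[ m^2 \exp\big\{ -2m\big(\mathcal{I}^-_\lambda(G_W)\big)_+^2 \big\} + \frac{2\lambda}{\bar u + \lambda} \right] \\
& - \pi_0\alpha\, \mathbf{1}\{\bar u \leq \lambda\} \left[ 1 - m^{-1} \sum_{i=1}^m \inf_{0 < u \leq 2\lambda} \{W_i(u)\} \right].
\end{aligned} \tag{31}
\]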

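To prove Proposition 8.4, we write the FDR as (using the same reasoning
and notation as in Section 8.4),

\[
\begin{aligned}
\mathrm{FDR}(\mathrm{SU}(W)) &= \pi_0 \sum_{i=1}^m \mathbb{E}\left[ \frac{\mathbf{1}\{p_i \leq \Delta_i(|R|/m)\}}{|R|} \,\Big|\, H_i = 0 \right] \\
&= \pi_0 \sum_{i=1}^m \mathbb{E}\left[ \frac{\mathbf{1}\{p_i \leq \Delta_i(\phi(\hat u'_{-i}))\}}{m\,\phi(\hat u'_{-i})} \,\Big|\, H_i = 0 \right] \\
&\leq \pi_0 \sum_{i=1}^m \mathbb{E}\left[ \frac{\Delta_i(\phi(\hat u'_{-i}))}{m\,\phi(\hat u'_{-i})}\, \mathbf{1}\{\Omega_1 \cap \Omega_2\} \right]
 + \pi_0\alpha\, m \big( \mathbb{P}[\Omega_1^c] + \mathbb{P}[\Omega_2^c] \big).
\end{aligned} \tag{32}
\]

On one hand, when $\bar u > \lambda$, we may write

\[
m^{-1} \sum_{i=1}^m \frac{\Delta_i(\phi(\hat u'_{-i}))}{\phi(\hat u'_{-i})}\, \mathbf{1}\{\Omega_1 \cap \Omega_2\}
\leq m^{-1} \sum_{i=1}^m \frac{\Delta_i(\bar u + \lambda)}{\bar u - \lambda}
= \alpha\, \frac{\bar u + \lambda}{\bar u - \lambda}
= \alpha + 2\alpha\, \frac{\lambda}{\bar u - \lambda}.
\]

On the other hand, when $\bar u \leq \lambda$, we have $\Omega_2 = \Omega$ and

\[
m^{-1} \sum_{i=1}^m \frac{\Delta_i(\phi(\hat u'_{-i}))}{\phi(\hat u'_{-i})}\, \mathbf{1}\{\Omega_1\}
\leq \alpha\, m^{-1} \sum_{i=1}^m \sup_{0 < u \leq 2\lambda} \{W_i(u)\}.
\]

This implies (30). The proof for (31) is similar, by noticing that (32) is an
equality when $\Delta \leq 1$.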
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8. 6. Proof of Proposition 3. 2 

Let $t = \alpha u$ and assume (A1)-(A2)-(A3). Consider first the conditional model.
Following the constrained Lagrange multiplier method, the problem is to maximize
in $(\lambda, w)$ the function:

\[
L(\lambda, w) = \sum_{i \in \mathcal{H}_1} F_i(w_i t) - \lambda \Big( \sum_{i \in \mathcal{H}_1} w_i - m \Big).
\]

Assume that $(w_i)_i$ is a critical point, i.e. that for each $i$:
$\frac{\partial L}{\partial w_i}(\lambda, w) = t f_i(w_i t) - \lambda = 0$, so that
$w_i = t^{-1} f_i^{-1}(\lambda t^{-1})$. Then, $\lambda$ is chosen such that $\sum_i w_i = m$,
i.e. $\lambda = t\,\Psi^{-1}(t)$ for $\Psi(y) = m^{-1} \sum_{j \in \mathcal{H}_1} f_j^{-1}(y)$.
Hence, we find that the only possible critical point is $\forall i,\ w_i = W_i^*(u)$.
To conclude, it is sufficient to prove that $(W_i^*(u))_i$ is a maximum. The latter
holds because for each $i$, $f_i$ is decreasing, so that
$\frac{\partial^2 L}{\partial w_i^2}(\lambda, w) = t^2 f_i'(w_i t) < 0$. Therefore, since
$\alpha u W_i^*(u) = (\Psi \circ f_i)^{-1}(\alpha u)$, where
$x \mapsto \Psi(f_i(x)) = m^{-1} \sum_{j \in \mathcal{H}_1} f_j^{-1}(f_i(x))$ is a differentiable
increasing function from $(0, 1)$ to $(0, \pi_1)$, we easily check that $\mathbf{W}^*$
satisfies (8) and is continuous.
Next, assuming in addition (A4), we obtain for all i, 

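\[
\lim_{u \to 0} W_i^*(u)
 = \lim_{x \to 0} \frac{x}{m^{-1} \sum_{j \in \mathcal{H}_1} f_j^{-1}(f_i(x))}
 = \lim_{x \to 0} \frac{m}{\sum_{j \in \mathcal{H}_1} f_i'(x) / f_j'\big(f_j^{-1}(f_i(x))\big)},
\]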

which exists in $[0, m]$. Finally, the results in the unconditional model follow from
the same reasoning as above by replacing $\mathcal{H}_1$ by $\{1, \dots, m\}$.



8.7. Proof for the continuous Gaussian case of Section 4.3

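Fix $u \in (0, 1]$. We aim to prove that

\[
\frac{1}{m} \sum_{i=1}^m \Phi\left( \frac{\mu(i/m)}{2} - \frac{c^{(m)}(u)}{\mu(i/m)} \right)
 \longrightarrow \int_0^1 \Phi\left( \frac{\mu(t)}{2} - \frac{c^{\infty}(u)}{\mu(t)} \right) dt.
\tag{33}
\]

First, we have $c^{(m)}(u) \to c^{\infty}(u)$, because $c^{(m)}(u) = \Psi_m^{-1}(\alpha u)$ and
$c^{\infty}(u) = \Psi_\infty^{-1}(\alpha u)$, where
$\Psi_m(x) = \frac{1}{m} \sum_{i=1}^m \overline{\Phi}\big(\mu(i/m)/2 + x/\mu(i/m)\big)$ and
$\Psi_\infty(x) = \int_0^1 \overline{\Phi}\big(\mu(t)/2 + x/\mu(t)\big) dt$ are decreasing
continuous functions such that $\Psi_m$ converges uniformly to $\Psi_\infty$. Second,
we have for any $\varepsilon > 0$,

\[
\left| \frac{1}{m} \sum_{i=1}^m \Phi\left( \frac{\mu(i/m)}{2} - \frac{c^{(m)}(u)}{\mu(i/m)} \right)
 - \frac{1}{m} \sum_{i=1}^m \Phi\left( \frac{\mu(i/m)}{2} - \frac{c^{\infty}(u)}{\mu(i/m)} \right) \right|
 \leq m^{-1} \big| \{ i \mid \mu(i/m) \leq \varepsilon \} \big|
 + \frac{|c^{(m)}(u) - c^{\infty}(u)|}{\varepsilon \sqrt{2\pi}},
\]

where we used that $\Phi$ is $1/\sqrt{2\pi}$-Lipschitz. Since the measure
$m^{-1} \sum_{i=1}^m \delta_{i/m}$ converges weakly to the Lebesgue measure $\Lambda$ on $[0, 1]$,
we derive the inequality
$\limsup_m \big( m^{-1} |\{ i \mid \mu(i/m) \leq \varepsilon \}| \big) \leq \Lambda(\{ t \mid \mu(t) \leq \varepsilon \})$.
By assumption, $\mu$ is positive on $(0, 1]$ so that the latter converges to $0$ as
$\varepsilon$ tends to $0$. This implies (33).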
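As a numerical illustration of the quantity $c^{(m)}(u) = \Psi_m^{-1}(\alpha u)$ used above, the following sketch inverts $\Psi_m$ by bisection. It is not taken from the paper: the function names, the bisection scheme, and the weight formula $W_i(u) = \overline{\Phi}(\mu_i/2 + c/\mu_i)/(\alpha u)$ are assumptions. The weight formula is obtained by combining the critical-point computation of Section 8.6 with a one-sided Gaussian shift model in which all means $\mu_i$ are positive.

```python
import math
from typing import List, Sequence


def upper_tail(z: float) -> float:
    """Standard Gaussian upper tail probability P(N(0,1) > z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def psi_m(x: float, mus: Sequence[float]) -> float:
    """Psi_m(x) = m^{-1} sum_i upper_tail(mu_i / 2 + x / mu_i); decreasing in x."""
    return sum(upper_tail(mu / 2.0 + x / mu) for mu in mus) / len(mus)


def gaussian_optimal_weights(mus: Sequence[float], alpha: float, u: float) -> List[float]:
    """Hypothetical helper: solve Psi_m(c) = alpha * u by bisection, then
    return the weights upper_tail(mu_i / 2 + c / mu_i) / (alpha * u).
    Assumes every mu_i > 0 and 0 < alpha * u < 1."""
    target = alpha * u
    lo, hi = -1.0, 1.0
    # Expand the bracket until Psi_m(lo) >= target >= Psi_m(hi).
    while psi_m(lo, mus) < target:
        lo *= 2.0
    while psi_m(hi, mus) > target:
        hi *= 2.0
    # Fixed number of bisection steps; Psi_m is continuous and decreasing.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if psi_m(mid, mus) > target:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    return [upper_tail(mu / 2.0 + c / mu) / target for mu in mus]
```

By construction the returned weights are nonnegative and sum to $m$ up to numerical error. When all means are equal, every weight equals 1, which corresponds to the unweighted procedure.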

Appendix 

Appendix A: Practical implementation of the new procedures 

Algorithm A.1. (Step-up algorithm for SU(W))

- Step 1: compute the weight vector $(W_i(1))_i$ and, for each $i$, the weighted
p-value $p'_i = p_i / W_i(1)$. If all the weighted p-values are less than or equal
to $\alpha$, then reject all the null hypotheses. Otherwise go to step 2.

- Step $j$ ($j \geq 2$): put $r = m - j + 1$ and $u = r/m$ and compute the weight
vector $(W_i(u))_i$ and, for each $i$, the weighted p-value $p'_i = p_i / W_i(u)$. Order
the weighted p-values following $p'_{(1)} \leq \dots \leq p'_{(m)}$. If $p'_{(r)} \leq \alpha r/m$ then reject
the $r$ null hypotheses corresponding to the smallest weighted p-values $p'_{(i)}$,
$1 \leq i \leq r$. Otherwise go to step $j + 1$ (if $j = m$ stop and reject no null
hypothesis).
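The steps of Algorithm A.1 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the name `step_up`, the convention that `weight_fn(u)` returns the vector $(W_i(u))_i$, and the choice to treat a zero weight as an infinite weighted p-value are all assumptions.

```python
import math
from typing import Callable, List, Sequence


def step_up(pvalues: Sequence[float],
            weight_fn: Callable[[float], Sequence[float]],
            alpha: float) -> List[int]:
    """Sorted indices of the hypotheses rejected by the scan of Algorithm A.1."""
    m = len(pvalues)
    # r = m - j + 1 for steps j = 1, ..., m, i.e. r goes from m down to 1.
    for r in range(m, 0, -1):
        u = r / m
        weights = weight_fn(u)
        # Weighted p-values p_i / W_i(u); a zero weight gives +infinity (assumption).
        weighted = [p / w if w > 0 else math.inf
                    for p, w in zip(pvalues, weights)]
        order = sorted(range(m), key=weighted.__getitem__)
        # Stop at the first (largest) r whose r-th smallest weighted p-value
        # is at most alpha * r / m.
        if weighted[order[r - 1]] <= alpha * u:
            return sorted(order[:r])
    return []
```

With the constant weight function `lambda u: [1.0] * m`, the scan reduces to the linear step-up procedure.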

Algorithm A.2. (Step-down algorithm for SD(W))

- Step 1: compute the weight vector $(W_i(1/m))_i$ and, for each $i$, the weighted
p-value $p'_i = p_i / W_i(1/m)$. If the smallest weighted p-value is strictly
larger than $\alpha/m$, then reject no null hypothesis. Otherwise go to step 2.

- Step $j$ ($j \geq 2$): put $r = j$, $u = r/m$ and compute the weight vector
$(W_i(u))_i$ and, for each $i$, the weighted p-value $p'_i = p_i / W_i(u)$. Order the weighted
p-values following $p'_{(1)} \leq \dots \leq p'_{(m)}$. If $p'_{(r)} > \alpha r/m$ then reject the $r - 1$
null hypotheses corresponding to the smallest weighted p-values $p'_{(i)}$,
$1 \leq i \leq r - 1$. Otherwise go to step $j + 1$ (if $j = m$ stop and reject all the null
hypotheses).
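A matching sketch of Algorithm A.2 is given below, under the same assumptions as for the step-up case: the name `step_down` and the `weight_fn(u)` convention are hypothetical, and a zero weight is treated as an infinite weighted p-value.

```python
import math
from typing import Callable, List, Sequence


def step_down(pvalues: Sequence[float],
              weight_fn: Callable[[float], Sequence[float]],
              alpha: float) -> List[int]:
    """Sorted indices of the hypotheses rejected by the scan of Algorithm A.2."""
    m = len(pvalues)
    # r = j for steps j = 1, ..., m, i.e. r goes from 1 up to m.
    for r in range(1, m + 1):
        u = r / m
        weights = weight_fn(u)
        # Weighted p-values p_i / W_i(u); a zero weight gives +infinity (assumption).
        weighted = [p / w if w > 0 else math.inf
                    for p, w in zip(pvalues, weights)]
        order = sorted(range(m), key=weighted.__getitem__)
        # Stop at the first r whose r-th smallest weighted p-value exceeds
        # alpha * r / m, and reject the r - 1 smallest ones.
        if weighted[order[r - 1]] > alpha * u:
            return sorted(order[:r - 1])
    # No step failed: reject all the null hypotheses.
    return list(range(m))
```

On the p-values `[0.03, 0.03, 0.03, 0.03]` with constant weights and `alpha = 0.05`, this scan rejects nothing, whereas the step-up scan rejects everything. The example shows that the step-down procedure is the more conservative of the two.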

Appendix B: Proofs of technical lemmas 

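Proof of Lemma 8.1. Let us first prove the first point. Denote
$\Delta'_i(u) = \Delta_i((1 - m^{-1})u + m^{-1})$, and $\phi(u) := (1 - m^{-1})u + m^{-1}$ with
$\phi^{-1}(u) = (mu - 1)/(m - 1)$, so that $\Delta_i(u) = \Delta'_i(\phi^{-1}(u))$. Since
$m\widehat{G}_{\mathbf{W}}(u) = (m - 1)\widehat{G}'_{-i}(\phi^{-1}(u)) + 1\{p_i \leq \Delta_i(u)\}$,
the following equivalence holds when $p_i \leq \Delta_i(u)$:

\[
\widehat{G}_{\mathbf{W}}(u) \geq u \iff \widehat{G}'_{-i}(\phi^{-1}(u)) \geq \phi^{-1}(u). \tag{34}
\]

First, assuming $p_i \leq \Delta_i(\hat{u})$, equivalence (34) used with $u = \hat{u}$ leads to the
inequality $\widehat{G}'_{-i}(\phi^{-1}(\hat{u})) \geq \phi^{-1}(\hat{u})$ and thus
$\phi^{-1}(\hat{u}) \leq \hat{u}'_{-i}$ because $\hat{u}'_{-i}$ is defined as a maximum. This implies
$p_i \leq \Delta_i(\hat{u}) = \Delta'_i(\phi^{-1}(\hat{u})) \leq \Delta'_i(\hat{u}'_{-i})$. Conversely,
assuming $p_i \leq \Delta'_i(\hat{u}'_{-i}) = \Delta_i(\phi(\hat{u}'_{-i}))$, equivalence (34) used with
$u = \phi(\hat{u}'_{-i})$ yields $\widehat{G}_{\mathbf{W}}(\phi(\hat{u}'_{-i})) \geq \phi(\hat{u}'_{-i})$ and thus
$\phi(\hat{u}'_{-i}) \leq \hat{u}$ by definition of $\hat{u}$. The first point is thus proved by
additionally noticing that when both $p_i \leq \Delta_i(\hat{u})$ and
$p_i \leq \Delta'_i(\hat{u}'_{-i})$ we have both $\hat{u}'_{-i} \geq \phi^{-1}(\hat{u})$ and
$\phi(\hat{u}'_{-i}) \leq \hat{u}$, so that $\phi(\hat{u}'_{-i}) = \hat{u}$.

For the second point of the lemma, remark that

\[
m\widehat{G}_{\mathbf{W}}(u) = (m - 1)\widehat{G}_{-i}(um/(m - 1)) + 1\{p_i \leq \Delta_i(u)\}. \tag{35}
\]

Therefore, we always have
$\widehat{G}_{\mathbf{W}}(u) \geq u \Leftarrow \widehat{G}_{-i}(um/(m - 1)) \geq um/(m - 1)$,
which implies, using $u = (1 - m^{-1})\hat{u}_{-i}$, that $\hat{u} \geq (1 - m^{-1})\hat{u}_{-i}$ always holds.
Next, when $p_i > \Delta_i(u)$, we have
$\widehat{G}_{\mathbf{W}}(u) \geq u \Rightarrow \widehat{G}_{-i}(um/(m - 1)) \geq um/(m - 1)$,
so that taking $u = \hat{u}$ in the relation above leads to $\hat{u} \leq (1 - m^{-1})\hat{u}_{-i}$
and thus $\hat{u} = (1 - m^{-1})\hat{u}_{-i}$. Conversely, if $p_i \leq \Delta_i(\hat{u})$, from the first point of
the lemma we obtain $\hat{u} = (1 - m^{-1})\hat{u}'_{-i} + m^{-1}$ and since $\hat{u}'_{-i} \geq \hat{u}_{-i}$ (because
pointwise $\widehat{G}_{-i} \leq \widehat{G}'_{-i}$) we deduce $\hat{u} > (1 - m^{-1})\hat{u}_{-i}$, which finishes the proof. □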

Proof of Lemma 8.2. For the first point, take $u' \leq (k - 1)/(m - 1)$ and
apply (35) with $u = (1 - m^{-1})u'$, which gives
$m\widehat{G}_{\mathbf{W}}((1 - m^{-1})u') = (m - 1)\widehat{G}_{-i}(u') + 1\{p_i \leq \Delta_i((1 - m^{-1})u')\}$.
Since $(1 - m^{-1})u' \leq (k - 1)/m$, assuming $\hat{u} \geq k/m$ and $p_i > \Delta_i((k - 1)/m)$, we obtain
$p_i > \Delta_i((1 - m^{-1})u')$ and $\widehat{G}_{\mathbf{W}}((1 - m^{-1})u') \geq (1 - m^{-1})u'$, which thus
leads to $\widehat{G}_{-i}(u') \geq u'$. Since this holds for any $u' \leq (k - 1)/(m - 1)$, we finally
have $\hat{u}_{-i} \geq (k - 1)/(m - 1)$.

To prove the second point, take $u' \leq (k - 1)/m$ and use (35) with $u = u'$. This gives
$m\widehat{G}_{\mathbf{W}}(u') = (m - 1)\widehat{G}_{-i}(u'm/(m - 1)) + 1\{p_i \leq \Delta_i(u')\}$. Since
$u'm/(m - 1) \leq (k - 1)/(m - 1)$ and $u' \leq (k - 1)/m$, if $\hat{u}_{-i} \geq (k - 1)/(m - 1)$
and $p_i \leq \Delta_i((k - 1)/m)$, we obtain $\widehat{G}_{\mathbf{W}}(u') \geq u' + m^{-1} > u'$. This holds
for any $u' \leq (k - 1)/m$ and also for $u' = k/m$ because
$\widehat{G}_{\mathbf{W}}(k/m) \geq \widehat{G}_{\mathbf{W}}((k - 1)/m) \geq (k - 1)/m + m^{-1} = k/m$.
Finally $\hat{u} \geq k/m$.

For the third point, remark that since we trivially have $\hat{u} \geq (1 - m^{-1})\hat{u}_{-i}$
(with an argument similar to that of the step-up case), it is sufficient to prove
$\hat{u} \leq (1 - m^{-1})\hat{u}_{-i}$. For this, use (35) with $u' = (1 - m^{-1})\hat{u}_{-i} + m^{-1}$, leading
to $m\widehat{G}_{\mathbf{W}}(u') = (m - 1)\widehat{G}_{-i}(u'm/(m - 1)) + 1\{p_i \leq \Delta_i(u')\}$. Since $R_{-i}$ is
step-down and since $u'm/(m - 1) = \hat{u}_{-i} + (m - 1)^{-1}$, we have by definition of $\hat{u}_{-i}$
that $\widehat{G}_{-i}(u'm/(m - 1)) < u'm/(m - 1)$. Therefore, assuming $p_i > \Delta_i(u')$, we
obtain that $\widehat{G}_{\mathbf{W}}(u') < u'$, meaning that $\hat{u} < u'$ because $R$ is step-down. Hence
$\hat{u} \leq (1 - m^{-1})\hat{u}_{-i}$. □

Appendix C: Some FDR bounds for SU(W) and SD(W) 
C.1. Step-up case

Lemma C.1. Consider the conditional model in the situation where only two
true null hypotheses are tested, that is, $m = m_0 = 2$. Then for any weight function
$\mathbf{W}$, the procedure SU(W) has a FDR equal to
$\alpha + \alpha^2 (1 - W_1(1))(W_1(1) - W_1(1/2))$.

In particular, in the conditional model with $m = m_0 = 2$, the FDR in the
above lemma has a maximum equal to $\alpha + \alpha^2/4$, attained e.g. for the weight
function $W_1(1/2) = 0$; $W_2(1/2) = 2$; $W_1(1) = 0.5$; $W_2(1) = 1.5$. Additionally,
in the unconditional model, the FDR of the above procedure is larger than
$\pi_0(\alpha + \alpha^2/4)$ and can thus be larger than $\alpha$ when $\pi_0$ is (very) close to 1.
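The closed form of Lemma C.1 can be checked by computing the probability of at least one rejection directly. When both hypotheses are true nulls, this probability is the FDR. The sketch below is only an illustration: the function names are hypothetical, and it assumes independent uniform p-values and thresholds $\alpha u W_i(u)$ that do not exceed 1.

```python
from typing import Tuple


def su_fdr_two_true_nulls(alpha: float,
                          w_full: Tuple[float, float],
                          w_half: Tuple[float, float]) -> float:
    """P(at least one rejection) for SU(W) with m = m0 = 2.

    w_full = (W_1(1), W_2(1)) and w_half = (W_1(1/2), W_2(1/2));
    p-values are assumed independent and uniform on [0, 1].
    """
    a1, a2 = alpha * w_full[0], alpha * w_full[1]              # thresholds at u = 1
    b1, b2 = alpha * w_half[0] / 2.0, alpha * w_half[1] / 2.0  # thresholds at u = 1/2
    # Event B: at least one p-value falls below its u = 1/2 threshold.
    p_b = b1 + b2 - b1 * b2
    # Event A \ B: both p-values are below their u = 1 thresholds,
    # but neither is below its u = 1/2 threshold.
    p_a_only = max(a1 - b1, 0.0) * max(a2 - b2, 0.0)
    return p_b + p_a_only


def lemma_c1_closed_form(alpha: float, w1_full: float, w1_half: float) -> float:
    """Closed form alpha + alpha^2 (1 - W_1(1)) (W_1(1) - W_1(1/2))."""
    return alpha + alpha ** 2 * (1.0 - w1_full) * (w1_full - w1_half)
```

For the weight function quoted above ($W_1(1/2) = 0$, $W_1(1) = 0.5$), both functions give $\alpha + \alpha^2/4$. The two agree whenever $W_i(1) \geq W_i(1/2)/2$ for $i = 1, 2$, which is the case in which the positive parts in the direct computation are inactive.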

C.2. Step-down case 

The next result states that SD(W) controls the FDR non-asymptotically, without
correction, in the case $m = 2$ and when $m_0 = m$ in the conditional model. This
is quite intriguing and suggests that SD(W) might control the FDR for any $m$
and $m_0$.

Lemma C.2. For any weight function $\mathbf{W}$, the procedure SD(W) controls the
FDR at level $\alpha$ in either of the two following cases:

(i) in the unconditional model when all the hypotheses are true, that is $m_0 = m$,

(ii) in both conditional and unconditional models when $m = 2$.

To prove (i), we easily check that, when all the hypotheses are true, the
FDR of SD(W) is $1 - P[\widehat{G}_{\mathbf{W}}(1/m) = 0]$ and is thus equal to the FDR of
LSD($\mathbf{W}(1/m)$), which is at most $\alpha$ from results on weighted linear step-down
procedures (see e.g. Blanchard and Roquain (2008)). To prove point (ii), we
just have to check the case $m_0 = 1$ from point (i). This trivially holds from (20)
(which also holds in the step-down case), because all the weights are smaller
than $m = 2$.